JOB SATISFACTION, JOB PERFORMANCE,
AND SITUATIONAL CHARACTERISTICS


RAYMOND A. KATZELL, RICHARD S. BARRETT, AND T. C. PARKER
Research Center for Indu 

Previous research has yielded inconsistent results on the relationship between job satisfaction and job performance of industrial workers. Recent reviews of research on this topic have identified studies in which no significant relationship was apparent, others reporting positive correlations, and even some in which an inverse relationship was found (Brayfield & Crockett, 1955; Herzberg, Mausner, Peterson, & Capwell, 1957).

Attempts to interpret these facts have led to theoretical statements which postulate the influence of a number of additional variables on the relationship between satisfaction and performance (Brayfield & Crockett, 1955; March & Simon, 1958; Morse, 1953). Among these are the motivations, expectations, and aspirations of the workers and the rewards obtainable through the various modes of behavior possible in the work situation.

Our review of extant research and theory on the subject has led us to a general model in which the work situation is regarded as a system having as separate outputs employee job satisfaction and performance, and as inputs characteristics both of the working environment and of the employees. Various of the inputs may be expected to affect either or both of the two sets of outputs via their effects on employee motivation, ability, or both. Furthermore, the inputs may be interactive in their effects. It seems to us that this model can accommodate the diverse findings regarding the relationship between job satisfaction and performance in various situations.

However, most of the research on which present information on the subject is based has not fulfilled the requirements of such a model. Previous research has typically involved relatively simple designs in which characteristics of employees or the work environment have been correlated with job satisfaction or performance, or in which satisfaction and performance have been correlated. Katzell (1957) has already pointed out that not much can be learned about the relationships between employee attitudes and performance from simple two-variable research designs of this type.

The detailing of the general model outlined above would entail research which examines simultaneously data on various inputs and outputs. The accumulation of such information should eventually lead to specification of the input conditions under which particular aspects of satisfaction and performance are positively correlated, negatively correlated, or uncorrelated.

The investigation reported here lies within the framework of this model by undertaking an analysis of employee satisfactions and performances in relation to certain characteristics of the work situation. We were fortunate to encounter an industrial setting which not only enabled the collection of these kinds of data, but which also presented several additional advantages. There was a sizeable number of work groups all performing essentially the same work with the same methods. Commensurable data were available for these groups on several objective measures of job performance. In spite of these uniformities, there were appreciable differences among the groups in several aspects of their work situations, including group size, ratio between male and female employees, wage rates, whether or not the group was unionized, and the size of the city in which the organization was located.

In brief, then, it was the object of this 
investigation to illuminate the relationship 
between job satisfaction and performance, 
through a research design that conceived of 
these variables as outputs of a system which 
had as inputs characteristics of the work 
situation. 

METHOD

Research Setting


The study was conducted in an industrial company which had 72 wholesale warehousing divisions concerned with the storage and distribution of drug and pharmaceutical products. The divisions were geographically decentralized throughout the United States, but operated under standard work methods and procedures that had been installed on a company-wide basis. The key production personnel of the warehouses were order-pickers, who filled orders from stock on the basis of a form detailing the items and their quantities as ordered by each customer. This situation lent itself to objective measurement of the quantity, quality, and profitability of production in each division, by measures to be described below. The warehouses had a mean of 35 production workers each, with a range from 13 to 83. The employees of 40 of the warehouses were represented by local unions.


Attitude Survey


In 1956, an attitude survey was conducted by Kolstad Associates among the employees of these warehouses. A questionnaire was administered to all employees (other than absentees) under supervision and conditions of anonymity. It contained 47 multiple-choice items concerning the employees’ job satisfactions. A content analysis of the questions yielded the following categories of coverage: nature of the job, job performance, pay, benefits, immediate supervision, management, teamwork, promotion opportunities, working conditions, and communications. It may be noted that this coverage, and the type of individual item, are of the sorts usually encountered in job satisfaction surveys. These attitude data, together with concurrent performance measures and situational information, were later made available to the present investigators for research purposes.


Performance Measures


The following measures were selected jointly by the investigators and company management as representing meaningful and objective indicators of performance, deemed to be largely dependent on the behavior of line employees in the warehouse and to be comparable among the warehouses. Each measure was computed as an annual figure for each warehouse as a whole, for the calendar year.

Quantity = number of orders per man-hour of production work

Quality = number of errors in filling orders in relation to man-hours of work

Profitability = net profit realized from operation of the division as a ratio to total gross sales; this figure is affected not only by warehouse performance but also by variations in wage rates, occupancy costs, etc.

Product-value productivity = dollars of products processed per man-hour of production

Turnover = additions to work force per quarter expressed as a percentage of total number employed; this is an imperfect measure of turnover, since it reflects expansion as well as replacement and makes no distinctions as to the reasons for terminations


Situational Characteristics


It will be recalled that our theoretical model includes variables which may be expected to affect either or both employee satisfaction and performance. The selection of such variables for investigation, if it is not to be done on purely pragmatic grounds, entails looking into the “black box” of the system and envisioning its dynamic properties. Among those properties which may be conceived as relevant both to employee satisfaction and performance are those relating to the motivations of the employees. Situational characteristics bearing on these motivations might include those reflecting the needs and expectations of the employee sample and those describing the incentives prevalent in the working environment. Limited by the after-the-fact nature of the study, we therefore sought in the available data those measures that could be regarded as relevant to motivational considerations. These were the following:

Size of work force = number of employees in the division

City size = size of the community in which the warehouse was located

Wage rate = total straight-time earnings expressed as a ratio to straight-time man-hours worked

Unionization = whether or not warehouse employees were represented by a union

Percentage male = percentage of warehouse employees who were men

Descriptive information was also compiled on policies of the company and on social and technological features of the warehouses as supplementary data regarding characteristics of the work situation.


RESULTS 


Various correlational analyses were performed within and between the data on job satisfaction, job performance, and situational variables as specified below.

Job Performance and Situational Variables


Product-moment correlations are reported in Table 1 among the five job performance and five situational variables noted above.

Quantity, Profitability, and Product-value productivity were highly intercorrelated. Product-value productivity was so highly related to the other two that we eliminated it from subsequent analyses, particularly since Quantity and Profitability are of more basic interest both to management and the psychologist. Quality and Turnover were both independent of all the other performance measures.

By breaking down the annual data into their component four quarters, it was possible to estimate the reliability of three of these dependent variables using Horst’s (1949) method. The resulting reliability coefficients were: Quantity, 0.87; Quality, 0.83; Turnover, 0.61. Whereas the reliabilities of the first two variables are satisfactory, the Turnover measure is rather unstable. This may help account for its unrelatedness not only to the other performance measures, but also to the situational and satisfaction variables to be discussed below.

The five situational variables were considerably intercorrelated. Some of these relationships are not surprising: larger warehouses are needed in larger communities; the larger warehouses are also those more likely to be unionized, since urban culture is common to both characteristics. The linkage among Percentage of male employees, Unionization, and Wage rate suggests some sort of mutual accommodation or integration among personal and plant characteristics; perhaps this is the result of such processes as self-selection of employees or necessary adjustment in plant policies.


Many significant correlations emerged from the comparison of performance measures with situational variables, in line with our expectations. In general, Quantity, Product-value productivity, and Profitability were found to be lower in warehouses which have higher wage rates, are larger in size and located in larger communities, are unionized, and have a larger proportion of male employees. A centroid factor analysis was performed on these data to help reveal the structure of the relationships noted above. A single common factor emerged, the rotated loadings upon which are given in Table 2. The pattern of factor loadings of the situational characteristics suggests that this factor represents the degree of urbanization of the setting in which the division is located: low urbanization being represented by relatively small community population, few employees in the division, lower wages, absence of a union, and lower proportion of male employees. Associated with this urbanization syndrome is the adequacy of warehouse performance. Warehouses characterized by relatively little urbanization were likely to show superior performance (especially financially), including some tendency toward low turnover; however, quality of production is independent of this factor.

TABLE 2
ROTATED CENTROID FACTOR LOADINGS OF SITUATIONAL AND PERFORMANCE VARIABLES

Variables listed: Quantity, Quality, Profitability, Turnover, Size of work force, City size, Wage rate, Unionization, Percentage male. (Loadings not recoverable.)


Relations to Job Satisfaction 


Relationships between the attitude questionnaire data and the performance and situational variables were analyzed on an item-by-item basis. This was done partly because of our interest in specific relationships and partly because we were not certain of the grouping of questionnaire items into an overall attitude scale or subscales.

The item score used in this analysis for each warehouse was computed in the following way. All items contained subjectively scaled response alternatives running, for example, from “I like my job very much” to “I dislike it very much.” A cutting point was fixed for each item on an a priori basis which was intended to divide the distribution of responses of all employees as nearly as possible into favorable and unfavorable halves. The percentage of employees in a given warehouse whose responses fell in the favorable part of the distribution of an item was used as the item score for that warehouse. In 29 of the 47 items, the “favorable” category turned out empirically to contain between 50% and 79% of all respondents; in 10 items, this percentage was less than 50, and in 8 items it was 80 or more. Variability among the warehouses in the percentages of favorable responses to items was adequate, the standard deviations of item scores ranging from 6.8 to 21.5 with a median of 13.7.

(It may be of interest to the reader to know the level and variability of aggregate job satisfaction scores, although these were not used in the correlation analysis for reasons previously stated. Such an aggregate score was computed for each warehouse by averaging the percentages of favorable responses to all 47 items. The distribution of these scores among the warehouses was approximately normal, and ranged from 43 to 83. The mean of this distribution was 61.9, and the standard deviation was 8.7.)
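The scoring procedure just described can be illustrated with a minimal sketch. The response coding (1 = least favorable to 5 = most favorable), the cutting points, and the function names `item_scores` and `aggregate_score` are assumptions made for illustration only; the original questionnaire coding is not reported here.

```python
from collections import defaultdict


def item_scores(responses, cutoffs):
    """Percent of favorable responses for each warehouse and item.

    responses: iterable of (warehouse, item, scale_value) tuples.
    cutoffs: {item: lowest scale value counted as favorable}
             (hypothetical coding, fixed a priori for each item).
    """
    tallies = defaultdict(lambda: [0, 0])  # (warehouse, item) -> [favorable, total]
    for warehouse, item, value in responses:
        tally = tallies[(warehouse, item)]
        tally[1] += 1
        if value >= cutoffs[item]:
            tally[0] += 1

    scores = {}
    for (warehouse, item), (favorable, total) in tallies.items():
        scores.setdefault(warehouse, {})[item] = 100.0 * favorable / total
    return scores


def aggregate_score(scores_for_warehouse):
    """Mean of one warehouse's item scores (the aggregate satisfaction score)."""
    return sum(scores_for_warehouse.values()) / len(scores_for_warehouse)


# Made-up responses for one warehouse "A" on two items.
responses = [("A", 1, 5), ("A", 1, 4), ("A", 1, 2), ("A", 1, 1),
             ("A", 2, 3), ("A", 2, 3), ("A", 2, 3), ("A", 2, 1)]
cutoffs = {1: 4, 2: 3}
scores = item_scores(responses, cutoffs)
print(scores["A"], aggregate_score(scores["A"]))
```

With these made-up data, two of four responses to item 1 and three of four responses to item 2 fall at or above the cutting point, so the warehouse is the unit that carries each score, as in the analysis reported here.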

The product-moment correlation between each of the 47 questionnaire items and each of the performance and situational variables was computed. A summary of the results is furnished in Table 3. In this table, a positive correlation refers to one in which greater job satisfaction, as represented in the item response, is associated directly with higher scores in the performance or situational variable; a negative correlation expresses an inverse relationship between satisfaction and the other variable.
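As a sketch of the computation involved, the product-moment correlation between one item score and one performance or situational variable can be obtained across warehouses as follows. The function name `pearson_r` and the four-warehouse data are hypothetical and serve only to show the arithmetic.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists,
    one value per warehouse."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Made-up item scores (percent favorable) and a made-up performance
# measure for four warehouses.
item_score = [1.0, 2.0, 3.0, 4.0]
performance = [1.0, 3.0, 2.0, 4.0]
print(round(pearson_r(item_score, performance), 3))
```

A positive value corresponds to the convention stated in the text: warehouses with more favorable responses to the item tend to score higher on the other variable.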

Table 3 reveals that, on the whole, job satisfactions were positively associated, beyond chance expectancy, with two aspects of performance, Quantity and Profitability. There was no relationship between job satisfaction and either Quality or Turnover.

It is also clear that job satisfactions were associated with situational characteristics of the division. There was typically higher job satisfaction in situations that had the earmarks of small town culture than in those with urban characteristics, i.e., having more employees, a large city location, higher wages, union representation, and proportionately more male employees.

Examination of the item correlations in terms of content areas of the questionnaire revealed no consistent trend for satisfactions with given aspects of the job to differ from one another in their relationships to performance or situational variables. In general, then, employees in the small town situations were more satisfied with various major aspects of their jobs, including supervision, pay, etc.

TABLE 3
ATTITUDE ITEMS SIGNIFICANTLY CORRELATED BEYOND THE .05 LEVEL, AND MEAN r's (47 ITEMS)

(Table body not recoverable.)

As an aid in comparing these results with those of other studies, it should be noted that the correlations reported in the last column of Table 3 are means of the 47 items, computed on the basis of an r to z transformation. Most other studies of the relations between attitudes and performance have reported correlations based on aggregate scores on all the items composing a questionnaire or a segment thereof. Had this procedure been followed in the present study, one may be reasonably sure that the total job satisfaction score would have shown higher correlations with Quantity and Profitability than appear in this column, because the items tended to have positive intercorrelations and would in aggregate have constituted a more reliable instrument than did a single item.
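The averaging of item correlations can be sketched as follows, on the assumption that the r to z transformation referred to is Fisher's (z = arctanh r): each r is converted to z, the z values are averaged, and the mean is converted back to r. The function name `mean_correlation` and the sample values are illustrative only.

```python
import math


def mean_correlation(rs):
    """Average a list of correlations through Fisher's r-to-z transformation."""
    if not rs:
        raise ValueError("need at least one correlation")
    zs = [math.atanh(r) for r in rs]          # r -> z
    return math.tanh(sum(zs) / len(zs))       # mean z -> r


# Two made-up item correlations with the same performance variable.
print(round(mean_correlation([0.0, 0.8]), 3))
```

In this made-up case the transformed mean is 0.5, whereas the plain arithmetic mean of the two correlations would be 0.4; the two methods diverge most when the correlations being averaged are large or widely spread.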


DISCUSSION 


The results of this investigation reveal generally positive relationships between employee job satisfactions and performances, under the conditions studied. The question remains as to why, particularly since many of the earlier studies have not found such relationships.

Part of the reason for the difference may, of course, reside in the previously noted methodological improvements, such as the relatively large number of groups, comparability of work and work methods, commensurable and objective performance measures, etc.

However, we do not believe that this is the sole explanation. We are inclined, rather, to interpret the results in terms of our previously stated model, which holds that the nature of the relationship between satisfaction and performance is dependent on the input conditions. The key to the conditions determining the positive relationship in the present study lies in the situational data. These data show that the various groups of employees studied differ along an urbanization dimension, and that the degree of urbanization is inversely associated both with satisfaction and performance. It is this last circumstance that may provide the clue to the positive correlations found between satisfaction and performance in this study.

Let us therefore proceed to interpret in terms of the model the obtained relationships between the situational variables and each of these two sets of outputs. It will be recalled that the model holds that inputs may affect satisfaction or performance through their influence on employee motivation, ability, or both. The situational variables studied in this investigation were selected in the belief that they are relevant to employee motivations. It developed, somewhat unexpectedly, that the five situational variables form a pattern expressing the degree of urbanization of the situation. The motivational relevance of differences along this dimension may be regarded as consisting of corresponding differences in culturally determined needs and expectations of the employees: the kinds of differences that may be anticipated in comparing a work group composed mostly of men who are residents of a large city, work in a large warehouse, and are unionized, with another group composed more of women, who are residents of a small town, work in a small warehouse, and are not represented by a union.
Given a fairly uniform working environment in terms of perquisites, policies, and technology, as is the case among the warehouses in this company, variations in satisfaction may stem from the differential fulfillment of differing employee needs and expectations within this environment. When note is taken of such characteristics of the job as its relative simplicity, limited upward mobility, and low pay rates compared with the average for all industry, the inverse relationship between the extent to which groups show urban characteristics and their job satisfaction is understandable. This interpretation is supported by the finding that the urban groups are significantly less satisfied with their pay, even though it is actually somewhat higher than that of their small town counterparts.

The basis for the relationship between urbanization and performance is less apparent from the information at hand. Again our model would point to motivational differences among the situations. The model would therefore lead us to interpret the obtained findings regarding productivity in terms of differences in the adequacy with which productive behavior fulfills the needs and expectations of employees in urban as compared with nonurban situations; with the perquisites and rewards of employment relatively constant across divisions, differences in adequacy of fulfillment would be attributable again to differences in the culturally determined needs and expectations of the employees. For example, it seems reasonable that the more female, nonunionized, small town groups may be more likely to expect productive behavior to lead to satisfaction of their particular needs for pay, status, and security than is the case in the urban groups; the more male, unionized, urban groups may instead seek to meet their needs for higher pay through organized efforts. Social norms regarding productivity relevant to needs for peer acceptance may also be quite different among urban and nonurban groups; in this connection, it may be noted that instances of output restriction, such as those described by Whyte (1955), have typically been reported in settings bearing a closer resemblance to our urban than to our nonurban ones. By way of further illustration, the greater satisfaction that the less urban groups experience toward their supervisors could lead such groups to expect more desirable consequences of performing in accord with management’s norms for high productivity. While speculative, these considerations do seem reasonable and to be in line with the information at our disposal.
The interpretation of the findings in terms of our theoretical model may therefore be summarized as follows: Among the 72 warehouses of this company, differences exist in situational input variables affecting employee needs and expectations. The perquisites and rewards of employment are essentially similar in all warehouses, and therefore fulfill the needs of employees in some of the warehouses more adequately than they do in others. This accounts for the significant correlations obtained between situational (input) variables and job satisfaction (output) variables. Furthermore, employees having different needs exhibit different levels of productive performance, depending on the perceived relevance of a given level of performance to the fulfillment of their needs. This accounts for the significant correlations obtained between situational (input) variables and performance (output) variables. Finally, it would appear that the patterns of needs and expectations that are better satisfied through employment in this work are also those which are more likely to be perceived by employees as capable of fulfillment through productive performance. This circumstance may explain the significantly positive correlations obtained between the two sets of outputs, job satisfaction and performance.
It should be noted that a positive relationship between satisfaction and performance would be expected to appear, in terms of this model, only under special circumstances such as seem to have existed here. The relationship would be absent were employment perquisites to vary with employee needs, were performance to be independent of employee needs (as in the case of paced production), etc. It is of interest to note that Worthy (195 ) has qualitatively described relationships generally similar to those observed here among the urbanization, morale, and performance of various units of Sears, Roebuck; a review of the nature of the retailing industry would likewise suggest that it affords conditions for positively correlated performance and satisfaction varying with employee motivations.
While we believe that our model therefore accommodates the facts found in this study, we recognize that it is by no means thereby proven. For one thing, the explanation introduces variables, such as needs, expectations, and fulfillments, on which we have scanty and only indirect data. Furthermore, the evidence is based on correlations, which may reflect coincidences rather than causality. For example, company management has reason to believe that the smaller plants are more profitable partly, at least, because they are newer and because in smaller communities there is greater flexibility of operation and a more compact trading area to serve. In this context, warehouse and community size may be more appropriately considered as control variables, for which profitability should be adjusted, than as independent variables. The extension of this logic would call for the perception of the situational characteristics as control variables, resulting in the conclusion that there is no appreciable correlation between job satisfactions and performances beyond what can be attributed to the relations that they have in common with the control variables.
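The control-variable reading can be made concrete with a first-order partial correlation, which removes from the satisfaction-performance correlation the part attributable to a single control variable. This is offered only as an illustrative sketch of that logic, not as an analysis performed in the study; the function name `partial_r` and the three input correlations are hypothetical.

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y with z held constant.

    x: job satisfaction, y: performance, z: a control variable
    such as degree of urbanization (labels are illustrative).
    """
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denom == 0:
        raise ValueError("partial correlation undefined when |r_xz| or |r_yz| is 1")
    return (r_xy - r_xz * r_yz) / denom


# Made-up values: satisfaction and performance correlate .25 with each
# other, and each correlates -.50 with the control variable.
print(partial_r(0.25, -0.5, -0.5))
```

In this made-up case the product of the two correlations with the control variable equals the satisfaction-performance correlation, so nothing remains once the control variable is held constant, which is the outcome the control-variable interpretation would predict.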

However, for the present, we are more inclined to adhere to the aforementioned conception of the situational characteristics as inputs or independent variables. When this view is taken, the results are not inconsistent with theory derived from previous evidence. Furthermore, the tentative retention of the model has the value of leading to further testing of its adequacy in this situation. We are now, for example, investigating employee performance and satisfaction under supervisors providing different degrees of consideration and initiation of structure (Fleishman, 1957), with hypotheses holding that the relations between supervisory inputs and employee outputs are functions of employee needs and expectations as reflected in our situational variables. The model also leads to efforts to improve the measures of employee needs and expectations, to the study of inputs that have sharper psychological content than our gross situational variables, and eventually to experimental change in inputs to test whether hypothesized changes in outputs would indeed follow.


SUMMARY


The objective of this study was to illuminate the relationship between employee job satisfaction and performance, through a research design that conceived of these variables as outputs of a system having as inputs the characteristics of the work situation. Data on employee job satisfaction, job performance, and situational characteristics were obtained in 72 comparable, geographically decentralized warehousing divisions of a company. These data were intercorrelated using the division as the unit of analysis. The findings include:

1. Job performance is not a homogeneous characteristic. Measures of quantity of production per man-hour and profitability are intercorrelated, but quality of production and turnover are each essentially independent of the other performance measures.

2. Employee job satisfactions, as measured by questionnaire items, are significantly greater in those divisions which turn out the greater quantity of production per man-hour and which are more profitable. Job satisfactions are significantly associated neither with turnover nor quality of production, as measured here.

3. Five situational variables are intercorrelated and may be represented by a general centroid factor characterized as urban vs. small town culture. These variables include community size, number of employees in the division, union representation, average wage rate, and proportion of employees who are male.

4. Divisions whose situational characteristics are in the direction of the small town culture pattern typically have greater employee job satisfaction and superior job performance (in terms of quantity of production and profitability); there is some trend for such divisions also to have lower rates of turnover.

5. Their correlational nature makes the results amenable to more than one interpretation. The one preferred by the investigators regards the situational characteristics as independent variables, with job satisfaction and performance as dependent variables which are correlated because each is a function of the same situational characteristics; employee needs and expectations are postulated as variables intervening between the situational and both the satisfaction and performance variables.
LYMAN W. PORTER
University of California, Berkeley

1 The authors wish to thank the management and employees of the United Airlines maintenance base in San Francisco for their cooperation in making this study possible. Particular gratitude is due L. T. Long, Jr., Hollis Williams, and Robert Daubenmire.

In a discussion of problems in the selection and classification of workers, Ghiselli and Brown (1955) point out that there are a number of possibilities for combining workers to obtain maximum effectiveness of a work team where the nature of the work requires cooperation among the members. In a laboratory investigation of naturally occurring combinations, Ghiselli and Lodahl (1958) found that the patterns of scores made by group members on scales derived from a self-description inventory were related to group effectiveness in a task requiring close cooperation within the group. The self-description inventory (SDI) was one developed by Ghiselli (1954), and the scales used were Supervisory Abilities and Decision-Making-Approach (DMA). For both the Supervisory and DMA scales, it was found that the most effective groups were those in which only one member of the group obtained a high score and all others obtained uniformly low scores, thus resulting in a positively skewed distribution of scores. The mean scores obtained by the groups on each scale were not related to the productivity of the groups. Ghiselli and Lodahl concluded that the balance or pattern of scores within a group was more important than merely the amount of a psychometric characteristic possessed by a group.

The purpose of the present study is to explore the extent to which such distribution or pattern effects are related to performance in industrial work groups. Ghiselli’s SDI was used as the psychometric instrument, and from the Ghiselli and Lodahl (1958) results it was hypothesized that the mean level of Supervisory and DMA scores in a group would not be related to group performance, but that the heterogeneity and skewness of group scores would show a positive relation to performance. Since the industrial groups had formally appointed leaders, the relation of the leader’s position in the distribution of his group’s scores to the group’s performance was also assessed. In particular, it was hypothesized that the higher the leader’s score relative to his group on the Supervisory scale, the better his group would perform.
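The three group-level indices named in these hypotheses (mean level, heterogeneity, and skewness of scores) can be illustrated with a short sketch. The text does not state which indices of heterogeneity and skewness were used, so the standard deviation and the moment coefficient of skewness are assumptions here, as are the function name `score_pattern` and the sample scores.

```python
import math


def score_pattern(scores):
    """Return (mean, standard deviation, moment skewness) for one group's scores.

    Standard deviation stands in for heterogeneity and m3 / m2**1.5 for
    skewness; both are illustrative choices, not the study's stated indices.
    """
    n = len(scores)
    if n == 0:
        raise ValueError("need at least one score")
    mean = sum(scores) / n
    m2 = sum((s - mean) ** 2 for s in scores) / n   # second central moment
    m3 = sum((s - mean) ** 3 for s in scores) / n   # third central moment
    sd = math.sqrt(m2)
    skew = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    return mean, sd, skew


# Made-up group: one member scores high, the others uniformly low.
print(score_pattern([10, 10, 10, 10, 30]))
```

A group with one high scorer and uniformly low remaining scores yields a positive skewness value, which is the pattern Ghiselli and Lodahl (1958) reported for their most effective groups.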

In an effort to attain greater understanding 
of the relationship between psychometric 
score patterns and group performance, the 
social characteristics of the group were also 
examined. In particular, it was reasoned that 
group cohesiveness might be an important
mediator between score patterns and group
effectiveness, in that the psychometric composition
of the group might affect its cohesiveness,
which in turn might affect its productivity.
Likewise, it was hypothesized that the
group leader’s popularity with his men might
affect the relationship between his position in
the group distribution of psychometric scores
and the group’s productive effectiveness.

In the industrial situation studied, the
groups were not uniform in the degree to
which cooperation or teamwork was necessary
in the performance of their work. Therefore,
an analysis was made of the effect of this
variable on the strength of the relationships
stated above. Specifically, it was hypothesized
that the strongest relationships would occur
in groups where the necessity for cooperation
was highest.

METHOD

Subjects and Research Setting

The population used in this study was 567 shop
workers organized into 62 groups employed at the
chief maintenance base of an airline. They were from
the power plant division which was responsible for
engine overhaul, and the types of operations carried
out included tearing down and disassembling engines,
cleaning and inspecting used parts, replacing worn
parts, remachining some of the used parts, reassembling
engines, and finally, testing completely reassembled
engines. There were four major sections of
the division, each under a general foreman; in each
section there were three or four “work centers,” each
under a foreman; in each work center were four or
five basic work groups, called “lead groups,” each
under a “leadman.” These lead groups varied in size
from 4 to 13 men, with a mean of 9.

All personnel at the foreman level and above were
considered members of management and were not
included in this study; leadmen and mechanics (the
airline’s term for any nonsupervisory shop worker,
regardless of specific duties) were members of a
machinist’s union. Although the leadman was a
member of the union, he functioned as a “straw
boss,” being in effect the first level of supervision and
having direct responsibility for the mechanics under
him. To become a leadman, a mechanic must be with
the company a minimum of 10 years, must pass a
job knowledge test, and must have the greatest
seniority among those eligible for the job. Leadmen,
thus, were mechanics with high seniority who were
familiar with the plant’s operations. Despite the fact
that they were nonmanagement, they were judged
by the management of the power plant to have considerable
influence over the performance and effectiveness
of the men under them.

The amount of cooperation required of the members
of a lead group varied with the nature of the
duties of the group. For some groups, the men
worked cooperatively as a team; for other groups
some teamwork and some individual work was required.
In all groups, however, the men worked in
close proximity with each other, and had frequent
contact with each other and with their leadman.

Procedure 


The men were tested in groups of 50 to 60. When
they arrived for a particular testing session they
were given a brief introductory explanation about
the study. They were told they would be asked to
“fill out individually a brief self-description form
which will give us a picture of the traits a person
believes he possesses and allow us to see how each
person describes himself. This is not a test. There
are no ‘right’ or ‘wrong’ answers on the form, and it
will be of use to us only if each man answers as
honestly and accurately as possible. The completed
forms will be completely confidential, and under no
circumstances will anybody in the company have
access to any of the material.”

Following these instructions and before the men
proceeded to fill out the SDI, they were asked to do
one other task: “write down the names of any men
in your lead group whom you would prefer as work
teammates. Each of you is asked to try to list at
least one person, but no more than five.” The confidential
nature of the data was stressed again, and
the men then were set to work completing their
sociometric choices and filling out the SDI. The total
time per person to complete these two tasks averaged
about 25 minutes.


Measurement of Psychometric Variables 


Ghiselli’s Self-Description Inventory (Ghiselli, 
1954) served as the psychometric instrument in this 
study. Two scales from this inventory were scored: 
Supervisory Abilities and Decision-Making-Ap- 
proach (DMA). These two scales were used in the 
Ghiselli and Lodahl study (1958) and are presumed 
to have some relation to the organization and control 
of group effort. The Supervisory scale is constructed 
of items that differentiate between the self-descrip- 
tions of individuals thought adequate for supervisory 
responsibilities and individuals considered inadequate 
for such positions (Ghiselli, 1954). The DMA scale is
composed of items that discriminate between the
self-descriptions of top- and middle-management
personnel, and which seem to describe how individuals
in these groups approach the decision-making
process (Ghiselli & Lodahl, 1958; Porter & Ghiselli,
1957). This latter scale probably measures more than
merely the type of approach to decision-making, but
does appear to reflect qualities which relate to group
functioning.


To test the hypotheses outlined in the introduc- 
tion, indices for each lead group were calculated for 
both scales: (a) the arithmetic mean; (b) heterogeneity,
measured by the standard deviation of scores;
(c) skewness, calculated by the formula 


$$\frac{M - Md}{\sigma}$$


(this formula for computing skewness differs from
that used by Ghiselli and Lodahl, 1958, because their
method was not appropriate for the larger-sized
groups used in this study); (d) the leadman’s score
position in his group, measured by computing his
percentile score within his own group’s score
distribution. For Indices a, b, and c, the leadman’s own
score was excluded from the calculations.
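
The four group indices described above can be illustrated with a short sketch. It is not the authors' computing procedure: the function name, the use of the population standard deviation, the reading of the skewness formula as (mean minus median) divided by the standard deviation, and the percentile definition (share of all group scores falling below the leadman's) are all assumptions made for illustration, and the scores are invented.

```python
from statistics import mean, median, pstdev


def group_indices(member_scores, leadman_score):
    """Return indices (a)-(d) for one lead group.

    member_scores excludes the leadman, as the text specifies for
    indices (a), (b), and (c).
    """
    m = mean(member_scores)
    sd = pstdev(member_scores)  # assumption: population standard deviation
    # Assumed reading of the skewness formula: (mean - median) / SD.
    skew = (m - median(member_scores)) / sd if sd > 0 else 0.0
    # Assumed percentile definition: percentage of all group scores
    # (members plus leadman) that fall strictly below the leadman's score.
    all_scores = list(member_scores) + [leadman_score]
    below = sum(1 for s in all_scores if s < leadman_score)
    percentile = 100.0 * below / len(all_scores)
    return {
        "mean": m,
        "sd": sd,
        "skewness": skew,
        "leadman_percentile": percentile,
    }


# Invented scores for a hypothetical five-man group and its leadman.
example = group_indices([10, 12, 12, 14, 22], leadman_score=15)
```

In this invented group one high scorer pulls the mean above the median, so the skewness index comes out positive, which is the pattern Ghiselli and Lodahl associated with effective laboratory groups.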


Measurement of Sociometric Variables 


The method used to measure cohesiveness in this
study is one suggested by Proctor and Loomis
(1951). The basic unit of this measure is a reciprocal
choice (RC) within a group. A reciprocal choice is
one in which Person A chooses Person B, and B also
chooses A. It is reasoned that groups with more
reciprocal choices in proportion to their size are more
cohesive than others. In constructing an index of
cohesiveness, RC/RCmax, it is necessary to adjust for
variations in the size of groups (n) and to take into
account the number of choices allowed (k). Proctor
and Loomis give the formula for RCmax as nk/2;
the result where n and k are both odd numbers is a
fractional number of reciprocal choices, which is
impossible; RCmax was therefore taken to the next
smaller integer in this study.

Another index computed from these sociometric
data was the popularity of the leadman. This was
obtained by dividing the number of choices group
members gave the leadman by the maximum possible
number of such choices.

These sociometric indices were derived from the
question, “Write down the names of any men in
your lead group whom you would prefer as your
work teammates.” Thus, these sociometric choices
were oriented toward work relationships instead of
purely social ones.
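
The two sociometric indices can be sketched as follows. This is only an illustration of the definitions given above: the function names, the data layout (a dictionary mapping each man to the set of names he listed), and the example group are hypothetical, and the leadman's popularity is taken here as choices received divided by the number of other group members, which is one reading of "maximum possible number of such choices."

```python
def reciprocal_choices(choices):
    """Count pairs in which A chose B and B also chose A."""
    count = 0
    for a, chosen in choices.items():
        for b in chosen:
            # a < b ensures each mutual pair is counted only once.
            if a < b and a in choices.get(b, set()):
                count += 1
    return count


def rc_max(n, k):
    """nk/2, taken to the next smaller integer when n and k are both odd."""
    return (n * k) // 2


def cohesiveness(choices, k):
    """RC / RCmax for one group, where k is the number of choices allowed."""
    return reciprocal_choices(choices) / rc_max(len(choices), k)


def leader_popularity(choices, leader):
    """Choices given to the leader divided by the number of other members."""
    members = [person for person in choices if person != leader]
    received = sum(1 for person in members if leader in choices[person])
    return received / len(members)


# Hypothetical four-man group; k = 2 is used only to keep the example small
# (the study allowed up to five choices).
group = {
    "lead": {"al", "bo"},
    "al": {"lead", "bo"},
    "bo": {"al"},
    "cy": {"lead"},
}
index = cohesiveness(group, k=2)
popularity = leader_popularity(group, "lead")
```

Here two of the choices are reciprocated (lead and al, al and bo), so the index is 2 out of a possible 4.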


Measurement of Necessity for Group 
Cooperation 


Ratings were used to measure the degree to which
group cooperation was necessary. Supervisory personnel
familiar with the work of the groups made
these ratings, and it was possible to have each group
rated by two supervisors. The ratings were made on a
six-point scale, with instructions being to rate
the technical necessity to work together, and to
disregard the degree to which the groups actually
cooperated. Each supervisor’s ratings were standardized
on his own mean, and the average of the two
standardized ratings was taken as the measure of the
necessity for cooperation for each group. The
interrater correlation coefficient for these data ranged
between .20 and .65. Although these coefficients are
low, this is mitigated to some extent by the fact
that the final data used were means, each based on two
ratings. Also, recent evidence (Buckner, 1959)
suggests that high interrater agreement in ratings does
not necessarily imply predictability, and may in
some cases indicate a lack of predictability.
Buckner suggests that one reason for this is that
different raters may ... different aspects of
performance, thus leading to lower agreement among
raters but a higher over-all ..., since the total
sample of behaviors observed ... Such a phenomenon
probably occurred in the ratings of necessity
for cooperation in the present study. Most of the
groups used have a variety of activities, and
the four ... foreman ... made these rating ... men each
viewed the ... different vantage points base ... the
groups’ behavior. It ... with the use of these
ratings ... presented above.
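
The scoring of the cooperation ratings can be sketched as below. The text says only that each supervisor's ratings were standardized on his own mean and that the two ratings per group were averaged; treating "standardized" as a z-score (mean and standard deviation) is an assumption, as are the function names and the invented ratings.

```python
from statistics import mean, pstdev


def standardize(ratings):
    """Convert one supervisor's ratings {group: rating} to z-scores.

    Assumes the supervisor's ratings are not all identical.
    """
    m = mean(ratings.values())
    sd = pstdev(ratings.values())
    return {group: (rating - m) / sd for group, rating in ratings.items()}


def cooperation_scores(ratings_by_rater):
    """Average the standardized ratings each group received."""
    per_group = {}
    for ratings in ratings_by_rater.values():
        for group, z in standardize(ratings).items():
            per_group.setdefault(group, []).append(z)
    return {group: mean(zs) for group, zs in per_group.items()}


# Invented six-point-scale ratings from two hypothetical supervisors.
scores = cooperation_scores({
    "rater_1": {"A": 2, "B": 4, "C": 6},
    "rater_2": {"A": 1, "B": 2, "C": 3},
})
```

Standardizing within rater removes differences in how leniently each supervisor used the scale, so that a lenient and a severe rater who rank the groups identically yield the same standardized scores, as in this example.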


Measurement of Productivity

... function of the power plant division ... and
repair of airplanes, it was not possible to obtain
productivity data in units of output. Instead, in
order to obtain a reasonably ... productivity measure,
time standards ... The company had computed time
standards ... most of the jobs in the power plant for
cost accounting purposes, and had data available
showing the over-all percentage ... monthly time
standards achieved by each lead group. These over-all
monthly figures were used as ... Because ... from one
... differences in job ... productivity percentage for
each group ... standardized on the mean percentage for
the work center to which it belonged. These work center
standardized scores served as the productivity data for
the lead groups in computing correlations with other
variables ... were average percentages ... being based
on the ... for the 3 months immediately preceding the
collection of the other data for the study. The
reliability coefficient ... corrected by the
Spearman-Brown formula ... determined ... correlation
between ... months preceding the ... studied, and the
productivity data for the ... study was .78. This was
considered satisfactory reliability, ... time standards
in this shop were not used ... purposes ... Also, the
nature of ... aircraft engine ... strong emphasis on
quality ... secondary but ... Under these conditions ...
incoming work will naturally ... the overhaul and ...
performed. These factors ... the criterion reliability
is ...
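
The two computations named in this section, standardizing each group's percentage within its work center and applying the Spearman-Brown formula, can be sketched as follows. The passage is too damaged to recover the exact procedure, so this is an illustration under assumptions: standardization is taken to be a z-score within work center, the Spearman-Brown formula is given in its standard general form, and all names and figures are invented.

```python
from statistics import mean, pstdev


def standardize_within_center(pct_by_group, center_of):
    """z-score each group's percentage of time standards within its work center."""
    by_center = {}
    for group, pct in pct_by_group.items():
        by_center.setdefault(center_of[group], []).append(pct)
    scores = {}
    for group, pct in pct_by_group.items():
        values = by_center[center_of[group]]
        sd = pstdev(values)
        scores[group] = (pct - mean(values)) / sd if sd > 0 else 0.0
    return scores


def spearman_brown(r, factor=2):
    """Predicted reliability when the measure is lengthened by `factor`."""
    return factor * r / (1 + (factor - 1) * r)


# Invented percentages for five hypothetical lead groups in two work centers.
pct = {"g1": 90, "g2": 100, "g3": 110, "g4": 80, "g5": 120}
center = {"g1": "wc1", "g2": "wc1", "g3": "wc1", "g4": "wc2", "g5": "wc2"}
z = standardize_within_center(pct, center)

# Illustrative only: the uncorrected correlation is not recoverable from
# the text, so 0.5 is a made-up value.
corrected = spearman_brown(0.5)
```

Standardizing within work center means a group is compared only with groups doing similar work, which is the stated reason for the adjustment.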


RESULTS 


Product-moment correlation coefficients
were used to evaluate the hypothesized relationships.
These are shown in Table 1. The
size of the sample for these computations was
55 groups. Data from 7 of the groups were discarded
because ... members of these groups was ill or on
vacation at the time of data collection. Table 1
shows that as expected, average level of group
scores was not related to productivity in these
groups. Heterogeneity of DMA scores
measured by the standard deviation of
scores was unrelated to performance;
heterogeneity of Supervisory scores was negatively
related to group performance. This
relationship is opposite the direction hypothesized.
Skewness of group scores was not
significantly related to productivity for either
scale. Both of these coefficients were negative,
however, again in the opposite direction from
that hypothesized. The leadman’s score position
as measured by his percentile position in
the group on the Supervisory scale was also
negatively related to group productivity. For
this relationship, the possibility existed that
the leadman’s divergence from his group’s
average was the significant factor in group
productivity rather than his actual position.
Examination of the scatterplot did not reveal
this to be the case: no curvilinearity was apparent
in the scatterplot of leadman’s percentile
versus productivity.

76 Thomas M. Lodahl and Lyman W. Porter

In summary, group score patterns on the 
DMA scale showed no significant relation to 
group performance in this study. On the
Supervisory Abilities scale, heterogeneity of
group scores and the leadman’s percentile score
position in the group were both related to productivity.
Both of these relationships were negative,
opposite to the direction hypothesized.

To assess the possible mediating role of 
group social characteristics, the interrelation- 
ships of Supervisory scale score patterns, 
sociometric variables, and productivity were 
calculated using product-moment coefficients. 
These are presented in Table 2. It can be seen 
in Table 2 that cohesiveness tends to be nega- 
tively related to Heterogeneity of Supervisory 
scores and positively related to Productivity. 
The index of leader popularity likewise tends 
to be negatively related to the Leadman’s 
percentile position and positively related to 
Productivity. These results indicate limited
support for the idea that score patterns exert
part of their influence on group effectiveness
through their relation to the social characteristics
of the group.

TABLE 2
INTERRELATIONSHIPS AMONG SUPERVISORY SCORE PATTERNS, SOCIOMETRIC VARIABLES, AND PRODUCTIVITY

TABLE 3
CORRELATIONS BETWEEN PREDICTOR VARIABLES AND PRODUCTIVITY OF GROUPS CLASSIFIED BY DEGREE OF NECESSITY FOR COOPERATION AMONG GROUP MEMBERS

The influence of the technical necessity for 
cooperation among group members on the relationships
of score patterns and social characteristics
to Productivity was examined by
breaking down the total sample of groups 
roughly into thirds by the degree of rated 
necessity for cooperation and computing cor- 
relations separately for each subsample. The 
results of this analysis are shown in Table 3. 
(Numbers of groups shown are smaller than 
in preceding tables because ratings of neces- 
sity for cooperation were not available for all 
groups.) In this table, all of the significant 
relationships appear in groups in which neces- 
sity for cooperation is high. One predictor, 
Leadman’s percentile position, showed signifi- 
cant relationships in preceding analyses but 
failed to reach significance in this analysis, 
although its correlations are moderately high 
and consistently negative over all degrees of 
necessity for cooperation. 
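
The subsample analysis described above (sorting groups by rated necessity for cooperation, splitting them roughly into thirds, and correlating a predictor with productivity within each third) can be sketched as follows. The function names and the data are invented for illustration, and the exact cut points the authors used are not stated, so equal-sized thirds are an assumption.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlations_by_third(groups):
    """groups: list of (cooperation_rating, predictor, productivity) tuples."""
    ordered = sorted(groups, key=lambda g: g[0])
    n = len(ordered)
    cuts = [0, n // 3, 2 * n // 3, n]
    result = {}
    for label, start, end in zip(["low", "medium", "high"], cuts, cuts[1:]):
        part = ordered[start:end]
        result[label] = pearson_r([g[1] for g in part], [g[2] for g in part])
    return result


# Invented data for nine hypothetical groups.
data = [
    (1.0, 1, 5), (1.5, 2, 3), (2.0, 3, 4),
    (3.0, 1, 1), (3.5, 2, 3), (4.0, 3, 2),
    (5.0, 1, 1), (5.5, 2, 2), (6.0, 3, 3),
]
by_third = correlations_by_third(data)
```

Splitting the sample this way reduces the number of groups behind each coefficient, which is one reason a predictor can be significant in the total sample yet fall short of significance within every subsample.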


DISCUSSION 

The results obtained in this investigation 
show that for the total sample of industrial 
work groups studied, patterns of group scores 
on the DMA scale of Ghiselli’s SDI were un- 
related to group productivity, but certain pat- 
terns on the Supervisory scale were related to 
group performance. These Supervisory score 
patterns also tended to be related to group 
cohesiveness and to the leadman’s popularity; 
these were in turn related to Productivity in
such a way as to suggest that part of the obtained
relation of score patterns to Productivity
is mediated by the social characteristics of
the group. Taken in sum, the results indicate 
that psychometric pattern indices in this in- 
dustrial study are not related to group per- 
formance in any simple manner predictable 
from the laboratory study. The findings seem 
to point instead to the existence of a complex 
interaction among psychometric patterns, so- 
cial factors, and group productivity. While 
the obtained correlations are low and run in 
unexpected directions, it seems useful to ex- 
plore possible bases for these relationships, 
utilizing the data on social characteristics and 
necessity for cooperation. 

As a whole, these findings can best be 
understood if the nature of the leadman’s 
position is taken into account. He is nom- 
inally in charge of his group, but is not offi- 
cially a member of management; in fact, he 
is a union member, as are his group members. 
Members of management of the shop stated
in interviews that the leadman has direct
responsibility for the performance of his
group and that he does in fact exercise great
influence on group performance. Yet in this
shop the leadman is given little if any formal
power over the group in terms of management-authorized
sanctions such as power to
hire, discharge, promote, etc. Lacking these
formal means of control, the leadman, it seems,
must resort to informal, interpersonal
ways of influencing his group.

It makes sense then that those leadmen
who are sociometrically popular with group
members would be able to exercise greater
influence over their behavior. A leadman
whose Supervisory score is high relative to his
group is likely to be less popular with his men
(see Table 2), and thus not have available
this means of control. This seems to be a
situation in which a trait (supervisory ability)
which would ordinarily be an asset to the
leader becomes a liability, given the means of
influence available to him. Here, if the leader
differs too much from his men on the trait, he
tends to lose his popularity and thus his influence.
In this connection it is interesting
that the only strong relation between leader
popularity and group productivity occurs in
groups in which a high degree of cooperation
among members is necessary. The type of
leadership employed by the highly chosen
leader seems to be a significant influence only
in situations where coordination of the efforts
of individuals is important in determining
group output.

Heterogeneity and group cohesiveness begin
to fit in if we suppose that a highly cohesive
group has a greater ability to govern
itself than a less cohesive one; control of
members’ behavior by the group is possible.
The leadman’s problem in this case is to guide
this force in directions beneficial to the total
organization, while not losing his own influence
with the group. Greater heterogeneity of
group supervisory scores generally is associated
with low cohesiveness, which in turn is
associated with low productivity.

These relationships can be understood when
the patterns of choice in highly cohesive
groups are examined. It appeared that those
groups scoring high on the cohesiveness measure
had for the most part oriented their
choices toward the leadman, who had in turn
reciprocated most of these choices. Statistically,
this was tested as the relation between
leader popularity and group cohesiveness: the
correlation coefficient was .42, significant beyond
the .01 level. It may be that this social
pattern provides one of the bases for the leadman’s
informal influence, in that in the cohesive
groups he may function as a social
“gate-keeper,” rewarding by social choice those who
contribute and punishing those who do not by
rejection. If he controls the social pattern to
this extent, this could also give the means of
guiding the power for control implied by
group cohesiveness to benefit the total organization.
This also helps explain why cohesiveness
is positively related to productivity, if
we assume most leadmen hold goals of high
productivity. Again here it is important to
note that the only significant relationships
occur in groups where a high degree of cooperation
is necessary. Apparently on relatively
independent jobs group psychometric
and social characteristics are less important in
determining productivity.
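
The significance statement for the correlation of .42 between leader popularity and cohesiveness can be checked with the usual t transformation of a product-moment coefficient. The sketch below assumes the coefficient was based on all 55 groups, which the text does not state explicitly; the function name is hypothetical.

```python
from math import sqrt


def t_for_r(r, n):
    """t statistic for testing a correlation r from n pairs against zero."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


# Assumption: n = 55 groups, the sample size reported in the Results.
t = t_for_r(0.42, 55)
```

Under that assumption t is about 3.37 with 53 degrees of freedom, which exceeds the two-tailed .01 critical value of roughly 2.67 and so agrees with the reported significance level.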

To some extent these findings parallel those 
of the Michigan studies of productivity, 
supervision, and morale (Kahn, 1956; Seashore,
1954). The earlier studies, reviewed by
Kahn, indicated that there is more relationship
between supervisory attitudes and productivity
than between group attitudes and
productivity. Although the leadmen in the 
present study are not clearly supervisors, the 
importance of their role is indicated by the 
relationships between leadman’s popularity 
and group productivity. In Seashore’s investi- 
gation of group cohesiveness and productivity, 
it was found that highly cohesive groups had 
either high or low productivity, and that the 
direction of this deviation was a function of 
perceived supportiveness of the larger organ- 
ization. In the present study, groups in which 
necessity for cooperation was high showed a 
positive relationship between cohesiveness and 
productivity. Although no attitude data were 
collected, the general atmosphere in the com- 
pany indicated that it is likely that most of 
the groups have relatively strong positive atti- 
tudes toward the company and to the plant 
management. In this case it seems reasonable 
that their productivity will be influenced by 
their attitudes toward the leadman, and by 
the leadman’s effectiveness at guiding the 
group through informal means of control. 
Finally, Seashore failed to find a hypothesized 
relationship between group “similarity” on 
age and education and group cohesiveness; in 
this study, heterogeneity of group Supervisory
scores was related to cohesiveness. In
summary, the present results do not conflict
with Seashore’s findings but rather they show
that there are other variables important in
determining the relationship between cohesiveness
and productivity, and that the relationships
are more complex than previously
stated.


CONCLUSIONS 


The results obtained in this industrial field
study suggest that as yet no firm statement
can be made as to the relationship between
patterns of psychometric scores in work
groups and group productivity. However, this
study has clearly demonstrated that essentially
social variables, such as group cohesiveness
and leader popularity, are important in
attaining an understanding of the nature of
the relationships between score patterns and
productivity. Furthermore, it is also clear
from these results that the strongest effects of
score patterns on productivity will be found
in groups where necessity for cooperation is
high. Finally, on the basis of reasoning presented
in the discussion section, it seems
likely that some consideration must be made
of the managerial situation of the group
leader (in terms of sanctions available to
him) if a clear understanding of this over-all
problem is to be attained.


SUMMARY 


This study was concerned with the idea
that group productivity in industrial work
groups may be related to the patterns of
psychometric scores formed by combining individuals
into groups, and that such score
patterns may exert part of their influence
through affecting social characteristics of the
group. Data were obtained from members of
55 industrial work groups on Ghiselli’s Self-Description
Inventory and a sociometric
questionnaire. For the Supervisory Abilities
scale of the Self-Description Inventory, heterogeneity
of group scores as measured by
the standard deviation of scores was negatively
related to productivity, as was the
leader’s percentile score position within his
own group. These variables were also negatively
related to group cohesiveness and the
leader’s sociometric popularity with his men;
cohesiveness and leader popularity were in
turn positively related to productivity. Using
ratings of “the necessity for group cooperation”
in performing the group task, it was
found that the strongest relationships between
the predictor variables and productivity were
in groups where necessity for cooperation was
high. It was concluded that patterns of psychometric
scores in industrial work groups
may bear some relation to group productivity,
but this relation is affected by social characteristics
of the group and the relation of the
group to the leader. These score pattern effects
and social influences on productivity are
strongest in groups where the work situation 
requires a high degree of cooperation among 
group members.

PSYCHOLOGICAL VERSUS SOCIOLOGICAL VARIABLES
IN STUDIES OF VOLUNTEER BIAS IN SURVEYS 


C. R. BELL 1

London School of Economics and Political Science

Investigators in the fields of opinion, market,
and social survey research must consider
the effects of basing their analyses on data
obtained from less than 100% of the population
sample chosen for study. The validity of
extrapolating results of these analyses to the
not-reached part of the population must also
be examined. There is little evidence to deny
that volunteers differ (other than in their
volunteering behavior) from others whose attitudes,
preferences, opinions, and behavior
have not been studied in the survey. Unfortunately
it is also undeniable that there is little
conclusive evidence of the ways in which they
do differ.

Some investigators have attempted a statistical
analysis of the effects of inadequate returns
to mail survey enquiries and have tried
to estimate the bias arising from basing results
on early and incomplete returns. Nowhere,
however, has there been attempted a
systematic description of the volunteer or a
formulation of rules to aid future research
workers who have to deal with the problem of
volunteer bias in fields of social investigation
which are relatively unexplored. Most published
reports have been concerned with the
effects of the use of volunteer subjects in one
specific situation at one particular time. A
glance through Table 1 below suggests that
the variables studied have been diverse,
though the list is far from describing all
aspects of behavior which may be relevant to
the question of volunteering and survey bias.
With one or two exceptions each entry in the
list of psychological variables derives from
one source only. Evidence of a variable’s universal
relevance to all volunteering situations
or even confirmation of its relevance in one
specific situation at different times is notably
lacking.

1 Present address: Medical Research Council Climate
and Working Efficiency Unit, Department of
Human Anatomy, South Parks Road, Oxford, England.


The present paper seeks to draw together 
and examine the findings of the various em- 
pirical studies of volunteer bias in order that 
some assessment might be made of the ade- 
quacy of the information at present available. 
This data source has been supplemented by 
the inclusion of hypotheses published by re- 
search workers who have based them on ex- 
perience and direct observation of volunteer 
subjects. 

PROBLEM 


When the percentage of the sample not 
providing data is small, it may sometimes be 
justifiably assumed that its exclusion from the 
analyses will have an almost negligible effect 
upon the final conclusions of the survey. In 
general, however, it may be that serious bias 
and distortion in survey results become more 
likely as the proportion of the sample not
reached becomes greater. 

In studies of the population of Great 
Britain there may be a loss of 5% or more 
(Gray & Corlett, 1950; Moser, 1949) due to 
prolonged illness, absence from home, move 
to another district, or death. In addition to 
these losses from the sample, there is also a 
loss of those who are reached but who explicitly
refuse to cooperate. With a well-designed
survey by interview or mail questionnaire,
with well-trained interviewers, an
adequately piloted question schedule, and
real incentives, the number of refusers may be
kept to a minimum. Even with several follow- 
up (call-back) stages, however, a large pro- 
portion of the percentage lost may be due to 
nonavailability of the persons chosen for 
study. The hard-to-reach individuals are 
often lost because the survey administrator 
cannot afford the time and extra expense in- 
volved in pursuing them to contact (Hilgard 
& Payne, 1944; Lundberg & Larsen, 1949). 

It is not rare to find reports of surveys 
(Clausen & Ford, 1947; Norman, 1948) in 
which less than one-half of those originally
chosen for study have provided data for analysis.
If the research worker is to establish
control and weighting systems, then it is 
essential that the effects of incomplete cov- 
erage of the survey sample population be 
investigated and understood. Basically, the 
interest of most research workers is not in 
obtaining a picture of all the ways in which 
those who provide data differ from those who 
do not, but rather their concern is with those 
differences which have a particular relevance 
to the topic they are studying. It is difficult 
to predict, a priori, what these will be in any 
given investigation. Research workers may, in 
consequence, try to achieve some degree of 
control by examining the data-providing (vol- 
unteer) part of the sample and weighting the 
distributions of those variables for which 
there is information available for the parent 
population (Clausen & Ford, 1947; Ford & 
Zeisal, 1949; Rollins, 1940; Shuttleworth, 
1941; Stanton, 1939; Zimmer, 1956). But 
attempts at securing “representativeness” by 
weighting scales and other correction devices, 
such as taking early versus late respondent 
differences as a guide to replier versus nonreplier
differences, may be invalid unless it
can be shown that weighting is in terms of 
“characteristics relevant to the study” (Fer- 
ber, 1949). 
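
The weighting approach discussed above can be illustrated with a minimal sketch of weighting respondents to known population proportions. It is not taken from any of the cited studies: the function names, the strata, and all figures are hypothetical, and the sketch corrects only for the variable weighted on, which is exactly the limitation the text raises.

```python
def poststratification_weights(sample_counts, population_shares):
    """Weight for each stratum: population share / share among respondents."""
    n = sum(sample_counts.values())
    return {
        stratum: population_shares[stratum] / (count / n)
        for stratum, count in sample_counts.items()
    }


def weighted_mean(stratum_means, weights, sample_counts):
    """Mean of a survey variable after applying the stratum weights."""
    num = sum(weights[s] * sample_counts[s] * stratum_means[s] for s in sample_counts)
    den = sum(weights[s] * sample_counts[s] for s in sample_counts)
    return num / den


# Hypothetical respondents: 30 under 40 and 70 aged 40 or over, drawn from
# a population known to be split evenly between the two age groups.
counts = {"under_40": 30, "40_plus": 70}
shares = {"under_40": 0.5, "40_plus": 0.5}
w = poststratification_weights(counts, shares)

# Invented proportion holding some opinion within each age group.
means = {"under_40": 0.2, "40_plus": 0.6}
adjusted = weighted_mean(means, w, counts)
```

The unweighted estimate in this example would be 0.48 and the weighted estimate 0.40. The adjustment is valid only if, within each age group, those who replied resemble those who did not on the opinion in question.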

Despite warnings that sample stratification
“on the basis of objective indices alone
[which are] largely sociological in nature
may not be sufficient” (Maslow & Sakoda,
1952), many survey researchers still seem to 
assume that a sample which reflects the par- 
ent population’s distributions of age, sex, oc- 
cupational level, social class, and household 
composition, invariably provides a sound 
basis for generalization of the survey findings 
to the general population. Even with an 
optimistic assessment of achieving complete 
representativeness in terms of just these vari- 
ables (Crossley, 1941; Gray & Corlett, 1950; 
Moser, 1949, 1955), the problem remains of 
assessing the relevance of each one of them to 
the attitudes, beliefs, preferences, and opin- 
ions in which the researcher is interested. 


Sociological Variables 


Attempts have been made to identify characters
which are coincidental with some forms
of volunteering. There have been investigations
(e.g., Kruglov & Davidson, 1953; Rosen,
1951; Siegman, 1956) to discover basic personality
traits associated with the willingness
and desire to volunteer, and studies (e.g.,
Crossley & Fink, 1951; Hilgard & Payne,
1944; Reuss, 1943) of the secondary characteristics,
usually sociological, of the section
of the population who volunteer. Unfortunately
the latter studies have been, in the
main, confined to post hoc investigations
which have limited application beyond the
situation which gave rise to them. Variables
studied and conclusions drawn are rarely the
same in any two reports. Because of this
specificity, it is difficult to draw many conclusions
about general volunteer characteristics
of a sociological nature. The information
presents an ambiguous picture. In certain
studies some variables are shown to be associated
with bias and in others they are shown
to be not associated. Some of the variables
cited may be factors relevant to physical
availability at the time of the survey.

In considering the apparently quite different
forms of behavior which are included
under the concept “volunteer,” the difficulty
of establishing universal sociological variables
becomes understandable. Homogeneity in anything
but the word “volunteer” is not easily
seen in: (a) volunteers who accept and keep
active membership of a listening, viewing, or
household budget panel; (b) volunteers who
answer questions put to them by a charming
person who interviews them in the home or
in the street; (c) volunteers who leave their
homes to participate in depth interviews,
taste-testing sessions, group discussions, or
program previews; and (d) volunteers who
complete and return mail questionnaires.
Apart from reviews (Clausen & Ford, 1947;
Norman, 1948) of studies in volunteer (respondent)
bias in mail-questionnaire investigations,
there has been no attempt to examine
and compare systematically the nature of
volunteering in all these situations.

A further cause of confusion in the appraisal
of studies using sociological variables
is that investigators, it appears, have not distinguished
between variables relevant to the
behavioral predisposition to volunteer and
variables associated with a person’s avail- 
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TABLE 1 
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PSYCHOLOGI( 


+ ] } ] 
to describ 


Terms used eer 
Better adjusted 

More articulate 

Less authoritarian 

Greater candor 

Less conventional 


More ¢ 


Curious 


ynscientious 


Less defensive 

Greater democratic 

More drive 

Ego-satisfied status 

Less ethnocentric 

Favorable attitude to Negroes 
Flexible in interpersonal relations 
Greater frankness 


Habits of promptness 


Volunteering 
Age 
Church going ar 


Ethnic 


qa reigio 
background 
Home ownership 


Household comp 
Marital status 


Oc ul ational status 


OTHER 


Method of delivery 
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Respondent attitud 
Anonymity, task 
For face-to-face re 


rects for 1 
ests ft 


ttractiveness of alternate sit 


of the 


iatior 
Observatior 


reactions of others 


public conditions of vol In 


Private or 


us-request intensity 


ability at the time of the survey (i.e., when 
the interviewer calls; after the receipt of the 
mail questionnaire; at the date and time of 
the group session). In Table 1 it may be 
seen that some variables: religious affiliation 
(Wallin, 1949a, 1949b), ethnic background 
(Kruglov & Davidson, 1953; Pan, 1951), and 


Assoc 


Al 


VAR 


ATION WITH VOLUNTEER BIAS 


VARIABLES 


More i 


Less 


csome 
nervous 
Greater optimism 
More poise 

More polite 

Less power pre 
Less rigid 
More self-assurance 
More 

Higher self-esteem 
More sexually 


Less stereotype: 


self-discipline 


active 


lin tl} 
iin thir 


t I Vy to pro} 


Less tende ectior 


Greater tolerance of others 


VARIABLES 


ABLES 


urban or rural family background (Kruglov 
& Davidson, 1953; Reuss, 1943; Wallin, 
1949a), associated with the volun- 
teering syndrome. Others, for example: tele- 
phone ownership (Wallace, 1954), marital 
status (Wallin, 1949a; Zimmer, 1956), and 
household composition (Hilgard & 


may be 


Payne, 
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have a closer association to 


“contactability” or “time-to-spare- 


1944), may 
factors of 
ability.” Age, frequently examined as either a 
psychological or sociological variable, seems 
often to be significant in relation to the time 
at the 
rather than in relation to a desire to volun- 


spent home of various age 


groups 
teer or avoid volunteering. Thus young moth- 
ers and elderly retired persons may decep- 
tively appear to be “‘volunteers” when merely 
they are ,those who are rarely not-at-home 
when the interviewer calls 

A purely quantitative approach (Belson, 1959; Clausen & Ford,
1947) to the identification of variables associated with
volunteer bias in surveys may tend to mask a distinction between
factors of volunteering and availability. Weighting on the basis
of numerical analysis alone contributes little to the
understanding of sources of bias specifically arising from the
use of volunteer subjects. Consideration of the ways in which
volunteer bias operates is essential to the generalization of
findings from one volunteering situation to another. If the
relevance of variables in sample stratification and weighting
devices is not understood, pseudo-corrections may be producing
more distortion in the results of analysis than in the original
data.


Psychological Variables


The search for basic or universal characters by which the
volunteer might be described and understood has been hardly more
fruitful. It has led to the use of psychological inventories and
personality tests. Volunteers have been examined in terms of
their scores on: the Berkeley F Scale, E Scale, PEC Scale
(Kruglov & Davidson, 1953); the Minnesota Multiphasic Personality
Inventory (Rosen, 1951); the Taylor Manifest Anxiety Scale
(Himelstein, 1956; Siegman, 1956); and scales of
aggression-conventionality-reserve, conservatism-liberalism
(Wallin, 1949a); defensiveness, rigidity (Siegman, 1956);
self-esteem (Maslow & Sakoda, 1952; Siegman, 1956); and social
participation (Gough, 1952; Lundberg & Larsen, 1949). It is
unfortunate that in all studies except one (Lundberg & Larsen,
1949) volunteers were not drawn from a representative section of
the general population. In addition it should be noted that the
situations for which volunteers were requested were not of the
kind usually found in the field of market and survey research.
These considerations severely limit the generality of the
findings unless it can be shown that volunteering motivations are
the same in all subsections of the general population and in all
situations for which volunteers are requested. On the evidence to
hand this proposition seems unlikely to be supportable.


In most of the studies using personality tests it is difficult to
translate findings expressed in test jargon into language which
is more directly appropriate to those working in situations
outside the context and orientation of a particular psychological
laboratory. It is, for example, not easy to convert into
behavioral terms of, say, marketing or [...] such descriptions of
[...] one having “a greater [...]” (Gough, 1952), “a higher [...]
potential” (Kruglov & Davidson, 1953), “less tendency [to]
projection” (Rosen, 1951). [...] personality and [...] immediate
needs [...] the field to whom the problem of [...]. These
variables appear to be [...]

Other Variables


A set of factors related to response rates 
or volunteer bias are those concerned with the 
nature of the invitation, the magnitude of the 
effort demanded of the volunteer, and the 
administration and technique of the survey. 
Studies dealing with response rates to mail 
questionnaires (Clausen & Ford, 1947; 
Mitchell, 1939; Norman, 1948; Wallace, 
1954) suggest the relevance of questionnaire 
length and design; the mode of delivery; the 
timing of follow-up letters; and the respond- 
ent’s task, reward, and anonymity. Studies of 
response to face-to-face interview requests for 
volunteers indicate the relevance of the in- 
tensity of the request (Rosenbaum, 1956), 
the relative attractiveness of alternate behav- 
ior (Blake, Berkowitz, Bellamy, & Mouton, 
1956), and the way in which others are seen 
to behave towards a similar request (Blake 
et al., 1956; Rosenbaum, 1956; Rosenbaum 
& Blake, 1955). Whilst these factors may be 
seen as relevant to the manipulation of the 
willingness to volunteer, they would seem to 
be more appropriate to considerations of the 
design and execution of the survey rather 
than to an understanding of volunteering as 
such. They may be conveniently grouped with 
such factors as the design of the questionnaire 
and interviewing schedule, the training of 
interviewers, and the conduct of an effective
call-back system.


DISCUSSION

The most obvious deficiency in the literature is the
lack of a report of an integrated, systematically
planned investigation of the subject of
volunteer bias for its own sake—stripped of 
the specific demands of commercial interests 
in a particular field project and going beyond 
the bounds of particular laboratory orienta- 
tions. Such an investigation would have first 
to define clearly the term “volunteer.” At 
present it appears to include widely diverse 
behavioral phenomena. On the basis of the 
data available it is not possible to come to a 
final decision about the reality of this di- 
versity in volunteering behavior. 

Perhaps some of the difficulty which has
been found in attempts to describe the nature
of volunteering stems from the possibility that on
many occasions what has been taken to be a
purposeful wish to volunteer has been no
more than availability at the time of the sur- 
vey and a lack of awareness of, or strong 
feelings about, being used as a subject. The 
characteristics of this type of so-called volun- 
teer may differ quite markedly from the 
characteristics associated with an active will- 
ingness or desire to volunteer. Although ques- 
tions of individual motivations to volunteer 
would seem to be an important part of any
consideration of the subject, little information 
or even suggestion is to be found. Greenberg 
(1956) has suggested that people who volun- 
teer do so because they are either too polite 
to refuse, or curious, or lonesome. 

Studies in which psychological tests have 
been used have underlined the laboratory 
worker’s ineptitude in communication with 
others who do not share the particular orien- 
tation he has to questions of personality and 
the tests used to investigate it. The list of 
sociological variables produces an incomplete 
and ambiguous picture. The reasons why such
a variable as, for instance, religious affiliation
and church-going (Reuss, 1943; Rosen, 1951; Wallin,
1949a, 1949b) is relevant should be understood
before that variable is
included in a weighting device aimed at a 
correction of volunteer bias. A whole complex 
of similar variables, identified as having some 
association with survey response through sta- 
tistical analyses of returns, gives no guarantee 
of correction of bias in the information 
gathered unless there is some understanding 
of the relationship between these variables 
and the kind of information given. “Blind” 
weighting may aggravate bias already present 
or may introduce new sources of bias unrecog- 
nized by the survey investigator. 
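
The warning about “blind” weighting can be illustrated with a small arithmetic sketch. All figures below are hypothetical and invented for illustration only; none comes from the studies reviewed. The sketch assumes two strata of equal size in the population, with respondents in stratum A answering like their stratum as a whole and the volunteers in stratum B answering more favorably than theirs. Weighting the returns to the population shares then moves the estimate away from the population value rather than toward it.

```python
from fractions import Fraction

# Hypothetical figures, invented purely for illustration.
# For each stratum: share of the population, number of respondents,
# and number of respondents giving a "favorable" answer.
strata = {
    "A": {"pop_share": Fraction(1, 2), "respondents": 300, "favorable": 120},
    "B": {"pop_share": Fraction(1, 2), "respondents": 100, "favorable": 80},
}

# Assumed true favorable rate in each stratum of the full population
# (in a real survey this is exactly what is unknown).
true_rate = {"A": Fraction(2, 5), "B": Fraction(3, 5)}


def unweighted_estimate(strata):
    """Proportion favorable among all respondents, ignoring strata."""
    total = sum(s["respondents"] for s in strata.values())
    favorable = sum(s["favorable"] for s in strata.values())
    return Fraction(favorable, total)


def weighted_estimate(strata):
    """Stratum rates among respondents, weighted by population share."""
    return sum(
        s["pop_share"] * Fraction(s["favorable"], s["respondents"])
        for s in strata.values()
    )


def population_value(strata, true_rate):
    """The value the survey is trying to estimate."""
    return sum(s["pop_share"] * true_rate[name] for name, s in strata.items())


print("population value   :", float(population_value(strata, true_rate)))
print("unweighted estimate:", float(unweighted_estimate(strata)))
print("weighted estimate  :", float(weighted_estimate(strata)))
```

In this constructed case the unweighted returns happen to match the population value only because two errors cancel; the weighting removes one of them and leaves the other. The sketch says nothing about how often this occurs in practice.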


CONCLUSIONS 


The problem of the effects of incomplete 
coverage of a sample survey population is a 
real and present one for applied psychologists 
in market, opinion, and social survey research. 
Variables which have been identified in at- 
tempts to discover the nature of volunteer 
bias provide many hypotheses and almost no 
conclusions. Before adequate sample stratifi- 
cation schemes and correction devices can be 
drawn up it is necessary for the apparent
heterogeneity of volunteering behavior to be
examined. There is also a need to clarify the
distinction between a willingness (desire) to
volunteer and a physical availability together
with the inability to refuse to cooperate. It
seems that neither a purely psychological, nor
purely sociological, nor purely statistical
approach to the study of volunteer bias is
adequate to deal with the problem. This appears
to be one subject in which an interdisciplinary,
laboratory and field, investigation would be
most useful.

The utility of particular classes of geometrical cues for
training on dial-reading tasks seems to be relatively unexplored.
Although psychological studies of dial design variables have a
history of approximately 20 years and the resulting literature is
voluminous, little emphasis has been directed toward changes
occurring in the usefulness of a given perceptual cue structure
during a prolonged training period. The present study was
designed to determine a rank order of performance utility for
three classes of geometrical cues on a highly overlearned
dial-reading task. For the purposes of our discussion the rank
order of utility of particular task cues as aids to performance
has been called “perceptual cue usefulness.”

Related to this investigation are a number of experiments
conducted within the framework of information theory (Anderson &
Leonard, [...]; Rappaport, 1957; Senders & Cohen, 1955). The
terms information, noise, and redundancy have proven to be useful
variables for the [...] communication networks as they [...] and
lack psychological [...]. However, their value as hypothetical
constructs for a perceptual task is questionable since a
quantitatively precise mathematical definition appears to be
virtually impossible. “One man’s noise is another man’s
information” to paraphrase an old adage; [...]

A more fruitful approach to the problem seemed indicated by
adopting the theoretical framework outlined by Bruner (1957) who
considers perception as a decision-making process affected by the
situational utility of available discriminatory cues. His
viewpoint provided the basis for our experimental hypothesis that
the level of performance is directly related to perceptual cue
usefulness and only incidentally to the amount of information,
redundancy, and noise in a given task.

METHOD 
[...] contained a 1 minute deviation from the actual [...]
setting [...]

Subjects. Twenty-one student Ss served in the experiment.

Procedure. The test was administered individually to each S. The
test booklets contained written instructions and four examples of
the dials complete with the correct response. After the
experimenter was assured that the task was thoroughly understood,
the S was instructed to turn the page and begin to work. For the
completion of each page 48 seconds were allotted, and the
experimenter gave the signal for turning to the next page.

TABLE 2
SUMMARY OF SIGNED RANK TESTS FOR PAIRED COMPARISONS
BETWEEN ANY TWO EXPERIMENTAL CONDITIONS

Conditions compared (each with rows for Correct Responses and
Completed Responses):
Pointers Only
Pointers + Perimeter
Pointers + Reference Axes
Pointers + Dots
Pointers + Perimeter + Reference Axes
Pointers + Perimeter + Dots
[Pointers +] Reference Axes [+ Dots]
[Pointers + Peri]meter [+] Reference Axes + Dots

[The cell layout of this table was lost in extraction, and the
signs of the entries were not reliably recovered. Entries in the
order in which they survive: 4.0**, 40.0**, 5.0**, 49.5*, 2.5,
107.0, 108.[?], 85.5, 55.0*, 113.0, 28.0**, 38.5**, 103.0,
25.0**, 15.[?]**, 20.0**, 51.0*, 23.0**, 30.0**, 100.5, 83.5.
A table note ending “number of completed responses” is otherwise
illegible.]

RESULTS

All statistical analyses were carried out by nonparametric
methods.* Overall significance of difference among experimental
conditions was assessed by analysis of variance by ranks,
differences between any two treatments by the Mann-Whitney U
test, and the relationship between the two criterion measures by
rank order correlation (Walker & Lev, 1953).

* The statistical tables have been deposited with the American
Documentation Institute. Order Document No. 6549 from ADI
Auxiliary Publications Project, Photoduplication Service, Library
of Congress, Washington 25, D. C., remitting in advance $1.25 for
microfilm or $1.25 for photocopies. Make checks payable to:
Chief, Photoduplication Service, Library of Congress.

Because the three types of backgrounds did not produce a
significant performance difference, the geometrical cues have
been averaged over all backgrounds for statistical analysis. This
method has enabled us to minimize possible artifacts occurring in
the distribution of “clock-times” for a given background.
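
As an aid to readers who wish to see what a rank order correlation between two criterion measures involves, the following is a minimal sketch. It assumes a Spearman-type coefficient (the Pearson correlation of the ranks, with tied values given the mean of their ranks); the report does not state which coefficient from Walker and Lev (1953) was used. The function names and the example scores are hypothetical and are not the study’s data.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical scores for five conditions (not the study's data).
percent_correct = [1, 2, 3, 4, 5]
percent_completed = [2, 1, 4, 3, 5]
print(spearman_rho(percent_correct, percent_completed))
```

A coefficient near +1 would indicate that conditions ranked high on one measure are also ranked high on the other; a positive but modest value corresponds to the kind of relationship the authors describe as positive but not high.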

For tabulation, it was considered to be 
more meaningful to convert the criterion 
measures from average numbers to percent- 
ages. Table 1 shows the rank order, the per- 
centage of correct and completed responses, 
and their correlation for each of the eight 


conditions of cue configurations. 
Although the relationship between the dependent variables was
positive, it was not high. Speed of performance was found to be
much less variable (maximum difference of about 8%) than accuracy
(maximum difference of about 17%). The significance of the
difference between any two treatments has been summarized in
Table 2.

The rank order of percentage correct and number of completed
responses provides an index of the perceptual usefulness of the
employed cues, both singly and in combinations.

The rank order indicates that optimum performance has been
obtained in terms of speed and accuracy for the following
conditions:

1. Pointers plus reference axes

2. Pointers plus perimeter plus reference axes

3. Pointers plus perimeter plus dots

The rank order of all eight dial configurations
for accuracy is shown in Figure 2.


DISCUSSION 


The experimental results appear to indicate 
that even though the information necessary to 
make a decision may clearly be present, some 
configurations of cues are of greater utility 
than others as aids to the observer. Performance obtained with
the two extreme display configurations as well as with the
intermediate six other combinations apparently fails to show a
trend which could have been predicted by measures of information,
noise, and redundancy as classically defined in mathematical
information theory. Accordingly, we believe that an “index of
cue-usefulness” operationally defined in terms of level of
performance may represent a practical guide for visual task
designs. There appears to be an optimum level of “cue-usefulness”
that is neither the minimum number of cues necessary for making a
correct response nor is it the maximum number of redundant cues.

The fact that “best performance” was obtained for a visually
“clean” but not minimally cued display is in agreement with good
human engineering principles (Garvey & Mitnick, 1955; McCormick,
1957). The problem still remains as to just what defines a
“clean” display objectively, and no hypothesis can be advanced by
the authors as to possible reasons for the superiority of the
particular configurations used or for the functional
ineffectiveness of the changes in background. Since number of
redundant cues alone is not a sufficient criterion, perhaps the
answer must be sought in figure-ground relationships in pattern
perception rather than in S-R terms.





90 Hilde Groth and John Lyman 


Compatibility of cues within a display ap- 
pears to be of at least equal importance as 
compatibility between displays and controls. 

The stability of the obtained hierarchy of 
these cue configurations at various stages of 
training will be investigated in a series of 
long-term experiments. We hope such an 
investigation will enable us to accumulate 
enough data to permit inferences about the 
more basic variables determining the utility 
of a given configuration. 


SUMMARY 


This study was designed to define a hier- 
archy of “perceptual usefulness” of geomet- 
rical cues in an overlearned dial-reading task. 
The hypothesis was postulated that perform- 
ance is a function of ‘‘perceptual usefulness of 
cues” rather than of the amount of informa- 
tion, redundancy, and noise present in a given 
situation. 

Examination booklets were prepared containing 24 pages with 12
geometrical “dials” of the same configuration on each page. A
total number of 288 “dials” had to be read and recorded by each
S. For each page, 48 seconds were allotted for completion.

The task consisted of reading “clock-times” on these “dials.” It
was selected because it fulfilled the requirement of an
overlearned task and also represented an acceptable laboratory
abstraction of a dial-reading task.

Eight cue configurations and three types of background were
combined in a factorial design with 12 replications of different
pointer settings for each of the 24 combinations. The test was
administered individually in a treatment by Ss counterbalanced
design. Twenty-one student Ss served in the experiment.

Results supported the hypothesis, and a rank order of “perceptual
cue-utility” was found. The implications of the results for dial
design have been discussed, and reliability and generality of the
findings at various levels of training will be investigated in
further studies.
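
The design figures quoted in the summary can be checked against one another with a few lines of arithmetic. The sketch below uses only the numbers stated in the text; the function name and the derived quantities (time available per dial, total working time) are illustrative additions and are not reported by the authors.

```python
def design_summary(cue_configurations=8, backgrounds=3,
                   dials_per_page=12, seconds_per_page=48):
    """Derive booklet totals from the factorial design stated in the text."""
    pages = cue_configurations * backgrounds  # one page per combination
    return {
        "pages": pages,
        "total_dials": pages * dials_per_page,
        # Derived values below are not reported in the article.
        "seconds_per_dial": seconds_per_page / dials_per_page,
        "total_minutes": pages * seconds_per_page / 60,
    }


for name, value in design_summary().items():
    print(name, value)
```

The 24 pages and 288 dials agree with the figures given in the summary, which supports reading the booklet as one page per combination of cue configuration and background.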
THE EFFECTS OF PERSONNEL REPLACEMENT
System Development Corporation


Almost all human organizations are subject 
to changing membership throughout the 
course of their existence. Problems of labor 
turnover have been dealt with in studies by 
Kangan (1948), March and Simon (1958), 
and Rice (1958). Problems of the effects of 
changing membership have also been the 
focus of small group studies by Simmel (1955),
Mills (1957), and Borgatta and Bales (1953).
Both from the industrial approach and the 
small group approach, there is agreement that 
intact groups are desirable and that anything 
that tends to destroy group membership in- 
tegrity is likely to degrade the group’s per- 
formance. 

Quite the opposite point of view appears to be held by
man-machine system designers who seem to treat individuals as
interchangeable units over a wide range of behavior. As Forgays
and Levy (1957) point out in the context of a military system,

The individual-interchangeability policy has many positive
features, if one considers the economics of unit training and
unit performance in combat. The notion that a commander may be
free to change the membership of units under his command without
performance and motivation decrements connotes a desirable
flexibility. The concept of crew integrity, however, implies
that a unit is a unique organization and will suffer in
performance if membership changes occur (p. 1).


Forgays and Levy go on to demonstrate, by
means of data collected on turnover among
bomber crews, that, in general, crews with a
medium number of membership changes had 
better combat performance scores than the 
high- or low-change crews. 

Fisher (1917) formulated the problem of turnover in terms of the
cost due to loss of production while replacements were learning
to achieve the production levels of the experienced men they
replaced. Kangan (1948) also suggested that turnover in one
section of a plant might affect the output of another section,
especially where the work was organized by the “chain” method.

More recently, Duncan (1955) reported an index called “skill
dilution” which can be used to measure the effect of labor
turnover on the “pool” of skill available at any time.


The effect of turnover on an information-processing system,
according to the concept of skill dilution, ought to be fewer
items of information correctly processed when compared with a
similar system without turnover. Furthermore, since many
information-processing systems are organized by the “chain”
method (i.e., information enters the system at one place, is
processed and passed on, reprocessed and passed on again, etc.),
the degradation in performance should be reflected in the system
output unless the other members of the system can develop ways of
overcoming the “bottleneck” effect of the new member.

The present study was undertaken as a preliminary step in the
experimental investigation of this area where membership was
deliberately manipulated as an independent variable. Since this
study was undertaken as part of a larger research program
directed at improving the Air Defense Command’s System Training
Program,¹ the questions posed were specific to
information-processing systems and training. Specifically, the
questions were:


1. What effect does the replacement of personnel in a complex
information-processing system have on the ability of the system
to accomplish its mission?

2. To what extent do the concepts derived from the analysis of
labor turnover account for the turnover effects in
information-processing systems?

3. To what extent can the expected degradation in performance be
overcome by various training methods?


This study was intended to get preliminary answers to the above
questions. Although several independent variables were
introduced, only those combinations of variables which seemed, a
priori, to be potentially most informative were investigated.
Therefore, the resulting “experimental design” was incomplete.
Nevertheless, the results seemed to have important implications
for man-machine information-processing systems. Consequently,
this study has been reported in its present form despite the fact
that not all combinations of all variables were studied and there
was little or no replication for those combinations that were
investigated.

¹ See Goodwin [...] for a description [...].


PROCEDURE 


The seven-man information-processing system used 
in this study was a simple analog of the surveillance 
and identification sections of the manual aircraft 
control and warning system (ACW). The major 
function of the laboratory analog was to establish, 
maintain, and interpret a simulated air traffic dis- 
play. The ACW system was designed to perform 
these same functions for real air traffic. The goal of 
the crew (and its military counterpart) was to
process accurately information received on “scopes”
so that it could be compared with data received from
another source.


System Description 


The input consisted of projected white dots against a black
rectangular grid at the scope reader position. The position of
any single blip was located in terms of a rectangular grid. A
frame of blips was projected every 20 seconds, 120 frames
constituting one problem. Blips appearing in sequence represented
tracks (paths of airplanes). Blips that did not appear in
sequence were referred to as “noise.”

The scope reader scanned the input screen and decided when a
sequence of blips in successive frames represented a track. He
then affixed a heading, track number, and time on the plastic
screen alongside the trail of blips. On subsequent frames he
updated the position of the track with appropriate symbols.

This track information was transmitted through a 
series (or chain) of data-processing positions as fol- 
lows: The scope teller read the information which 
had been affixed on the input screen by the scope 
reader and transmitted it over voice phone to the 
conversion plotter who was plotting in Cartesian 
coordinates on a large plotting board, whereupon the 
conversion teller converted the information into 
polar coordinates and transmitted by voice phone to 
a second remote plotting location. There the polar 
plotter plotted the information on a polar coordinates
board.
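
The conversion teller’s step, turning a Cartesian plot into polar coordinates, can be sketched as follows. The report does not describe the conventions of the plotting boards, so the sketch assumes that the polar board is centered on the Cartesian origin and that bearing is measured in degrees clockwise from the positive y axis (taken as north). The function name is hypothetical.

```python
import math


def cartesian_to_polar(x, y):
    """Convert a Cartesian plot to (range, bearing in degrees).

    Assumptions (not stated in the report): the polar board is centered
    on the Cartesian origin, and bearing is measured clockwise from the
    positive y axis, in the range 0 to 360 degrees.
    """
    distance = math.hypot(x, y)
    # atan2(x, y) rather than atan2(y, x) gives an angle measured from
    # the y axis, increasing clockwise.
    bearing = math.degrees(math.atan2(x, y)) % 360.0
    return distance, bearing


# A point 3 units east and 4 units north of the center.
print(cartesian_to_polar(3.0, 4.0))
```

Under a different convention (for example, angles measured counterclockwise from the x axis) only the argument order and direction of the angle would change; the range is the same either way.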

The track correlator was given auxiliary informa- 
tion concerning the programed flight plans of cer- 
tain tracks. When the positions of the track and the 
flight plan information were sufficiently close (within 
15 miles and 2 minutes), the track correlator de- 
clared the track to be “known.” Otherwise, the track 
was declared “unknown.” This information was 
given to the supervisor who passed it on to the other 
crewmen so that they would not continue to maintain
“known” tracks.

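
The track correlator’s decision rule can be written out as a short sketch. The 15-mile and 2-minute limits are taken from the text; everything else is an assumption made for illustration: the `Fix` record, the use of straight-line distance on a flat grid, and the treatment of the limits as inclusive.

```python
import math
from dataclasses import dataclass


@dataclass
class Fix:
    """A position report; a hypothetical record, not from the report."""
    x_miles: float
    y_miles: float
    time_min: float


def classify_track(track, flight_plan, max_miles=15.0, max_minutes=2.0):
    """Return "known" if the track agrees with the flight plan, else "unknown".

    Assumes straight-line distance on a flat grid and inclusive limits;
    the report gives only the 15-mile and 2-minute figures.
    """
    distance = math.hypot(track.x_miles - flight_plan.x_miles,
                          track.y_miles - flight_plan.y_miles)
    time_gap = abs(track.time_min - flight_plan.time_min)
    if distance <= max_miles and time_gap <= max_minutes:
        return "known"
    return "unknown"


print(classify_track(Fix(100.0, 50.0, 12.0), Fix(109.0, 62.0, 13.5)))
```

Both conditions must hold together: a track that is close in position but more than 2 minutes off the flight plan is still declared unknown under this reading.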
Each experimental session involved the operation
of the laboratory system by a group of subjects (a
crew) for 40 minutes. Following this, the crew met
in a separate room where they were provided with
formal knowledge of performance results and were
allowed to discuss these results as a group.


Turnover was introduced by replacing one crew member prior to the
next experimental session. A different position was “turned over”
each time. This turnover rate resulted in complete turnover of a
crew at the end of eight sessions. The control condition
consisted of crews operating the system under equivalent
conditions without replacement of personnel. These stable crews
were labeled “A” crews, all other crews were turnover crews (B,
C, D); see Table 1. Different turnover crews had their personnel
replaced in different order. However, the order difference was
not confounded with the other experimental variables.

TABLE 1
CHARACTERISTICS OF CREWS

[Table body not recoverable from the source. Surviving column
headings: “Turnover Crews,” “Replacement in [...]”.]
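
The turnover schedule described above can be traced with a short sketch. It assumes one reading of the text: the seven-man crew works the first session intact, and one original member, in a different position each time, is replaced before each of sessions 2 through 8. The position names are those mentioned in the system description; the particular replacement order is hypothetical, since the report says only that the order differed between crews.

```python
# Positions named in the system description.
POSITIONS = [
    "scope reader", "scope teller", "conversion plotter",
    "conversion teller", "polar plotter", "track correlator", "supervisor",
]


def originals_remaining(replacement_order, sessions):
    """Number of original crew members present in each session.

    Assumes no replacement before session 1 and one replacement, in the
    given order of positions, before each later session.
    """
    remaining = set(replacement_order)
    counts = []
    for session in range(1, sessions + 1):
        if session > 1 and remaining:
            remaining.discard(replacement_order[session - 2])
        counts.append(len(remaining))
    return counts


# Hypothetical order of replacement for one turnover crew.
print(originals_remaining(POSITIONS, sessions=8))
```

On this reading no original member remains by the eighth session, which is consistent with the statement that turnover was complete at the end of eight sessions.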


System Experience


In terms of the “skill dilution” concept there must be a “pool”
of skill to be diluted by turnover, if turnover is to have an
effect. Therefore, the amount of system experience of the
original members of each crew was varied for different crews.
Table 1 shows that there were two stable crews (A, A1) and two
turnover crews (B, B1) whose original members had no system
experience when they were formed. All other crews (A2, C, D, C1,
D1, B2) had original crew members with system experience ranging
from 1-15 sessions. Crews B2 and A2 were matched on this
variable, since all original members had had eight sessions of
system experience prior to becoming members of these crews. The
other four (C, D, C1, D1) were roughly matched in that all had a
similar wide range of system experience in their initial
complement.

In a similar way, the amount of system experience of the
replacements for the turnover crews was varied for one crew, B2.
Its replacements came from Crew B, and were, therefore, variously
system experienced in the same position. All other turnover crews
(B, B1, C, D, C1, D1) had replacements who had no previous system
experience when they reported for work with the crew.


Training 


All subjects had ie form individual instr 
tion before assuming their positions as members o 
rew. The original complement of Crews A:, By, 
Bo, C:, and D, received standardized orientation and 
practice. A general description of the 
ach group followed by a tour of 
ical layout. Individual position instruction was pro- 


system was 


the phys- 
vided by use of 
vaches. The practice problem was 
imes with interruptions for questior 
owed by a 20-minute dis 
nts for these 
as the 


ul 


a practice problem and individual 


{ 
t 
l 


on peri 
crews had the ime 
members Ce] ior 


} 


original crew 
going into D The € 


} their 
individual instruction fro 


re placed 


individual 
I 
his instruction occurred dt *xperimental 
session just prior to the ne whe they assumed 
their duties.* 
4 different individual t 
for the original 
and D. All 
were 


V ided 
Crews A, B, C, 
individually and 
practice problem with less than a certain numl 
errors per run (specific to each position) with 
repetitions of the problem. Thu individ 
trained to a criterion of 
his crew. The former training procedure 
equal individual training experience, the 
sulted in equal individual performance proficiency 
After each run, all crews had an opportun 


require form 


joining 
resulted in 


latter re 


peritormance 


This treatment was introduced it 
cover, with a minimum number of 


of variation of training method 
discuss their performance in a face-to-face meeting
which averaged 45 minutes. This was a post-« 
session 
10 minutes post-exercise and 35 minutes pre-exercise 
Standard knowl- 
edge of results was reported to all crews. This was a 
number of tracks correctly 
processed by the system during the trial just com- 
pleted. Experience with the Air Defense Command’s 
System Training Program and previous laboratory 
studies indicated that more performance improve- 
ment could be expected if such system performance 
summaries were supplemented with detailed informa- 
tion on the f 
each track. This method was used with Crew D; who 
had both the system summary and 
position-oriented knowledge of results. Finally, it 
was decided that position-oriented kno 

sults could be improved by 


the data collectors which w 


xercist 


for all crews except D;, which was allows 


for this purpose (see Footnote 2 


box score” stating the 


performance of each crew member on 


performance 


written 
behavior which 

were most need of improvement. Thus, in 

to a detailed 

in the po 

individuals 


attention on those aspects 


addition 


edge of 

ing the part or 

This method 
had the 


plus written constru 


to improve first 
D, which, as a result, 
as Crew D 
Therefore, in terms of the 

Table 1, Crew D, replicated Crew 
replicated Crew C. However, the 
he nature of the knowledge of r 
Crews C and C, had a minima 
had more complex knowledge of re 
had the most complex fe 


Data Collected


tion lir 
S¢ rved 


( 


This statistic 


task units 


upervisor 
f uircraft tracl 
ber requiring processing. Thi 
100, w 


system or 


is called 


centage 


RESULTS AND DISCUSSION 


The results of this study are shown in Figures 1, 2, and 3. The performance effectiveness of each crew has been plotted for each experimental session. The crews without turnover, A, A1, and A2, appeared to improve with experience as did two of the turnover crews, B and B1. The other five turnover crews indicated either a failure to improve or even a decline in performance effectiveness with similar experience. These results suggest the anticipated answer to the first question posed by this study. The effect of crew turnover on a complex information-processing system was, for some of the experimental conditions, to degrade its ability to perform its mission.

Fig. 1. Performance of inexperienced crews.





Fig. 2. Performance of experienced crews.


However, the amount of degradation ap- 
peared to depend on the amount of skill de- 
pletion produced by the turnover. 

Turnover crews with no initial system experience, B and B1, performed at least as well and seemed to improve as much with practice as comparable stable Crews A and A1 (see Figure 1). This condition represents minimal skill dilution because the B and B1 crews had no original pool of skill. The rate of retardation in the accumulation of system experience by these crews did not appear to affect their ability to improve with practice.

Fig. 3. Performance of turnover crews with a wide range of experience.

The most skill dilution in this study occurred in crews originally composed of men with system experience and whose replacements were men who had only individual training prior to joining the crew. This condition was represented by turnover Crews C, C1, D, and D1 (see Figures 2 and 3). All four crews fall short of the performance expected of stable crews with comparable experience. Crew A1 was designed as a control for Crews C1 and D1. Unfortunately, unanticipated turnover forced this crew to terminate short of the number of experimental sessions necessary for this comparison. The theoretical stable crew performance curves shown in Figures 2 and 3 were used to evaluate the effect of this level of skill dilution in lieu of comparable experimental data. (The derivation of these theoretical stable crew performance curves will be discussed in a later section.)

With the possible exception of some of the results from Crew D, these data show considerable degradation in performance associated with a kind of turnover which produced a large amount of skill dilution. (Compare Crews C1 and D1 with A1. Also compare C and D with the theoretical stable crews. Note that Crew B2 was intermediate in both skill dilution and performance degradation.) In view of this analysis, the second question posed for this study has been partly answered. The concept of “skill dilution” could be used to account for the direction and relative magnitude of the effect of turnover on an information-processing system.

The industrial and the small groups studies 
in this area suggested that the effect of turn- 
over ought to extend beyond the position re- 
placed, especially in a chain-type organization 
where the input to one man is the output from 
another. The following method was used to 
determine whether turnover in one position 
was associated with a deviation from normal 
performance in another position. 

One-half of the data, namely, alternate sessions, were used to obtain an estimate of normal performance improvement with practice for each position. A negatively accelerated exponential curve was fitted to these data by the method of least squares. Checks for internal consistency were made by using the other half of the data, as follows: (a) The number of experimental sessions (experience) accumulated by each crew member on each day was entered in the performance curve for his position. This resulted in a predicted performance score for each of the six information-processing positions in the system. (b) The product of these six scores yielded the theoretical system performance score for the day. The rank-order correlation between these theoretical scores and the observed performance of the stable crews was +.89 (p < .01). The size of this correlation justified the use of these performance equations to predict stable crew performance for experience levels beyond those sampled experimentally (see theoretical stable curves, Figures 2 and 3).
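The split-half procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' computation: the curve form score = a - b * exp(-c * sessions), the grid search over the rate c, and every function name below are hypothetical choices made for the example, and scores are assumed to be proportions so that the product of six position scores is meaningful.

```python
import math


def fit_exponential(sessions, scores, rates=None):
    """Least-squares fit of score = a - b * exp(-c * session).

    Assumed curve form. For a fixed rate c the model is linear in a and b,
    so c is scanned over a grid and a, b are solved in closed form.
    """
    if rates is None:
        rates = [k / 100 for k in range(1, 301)]  # candidate c: 0.01 .. 3.00
    n = len(sessions)
    y_mean = sum(scores) / n
    best = None
    for c in rates:
        z = [-math.exp(-c * x) for x in sessions]  # regressor multiplying b
        z_mean = sum(z) / n
        szz = sum((zi - z_mean) ** 2 for zi in z)
        if szz == 0:
            continue
        szy = sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, scores))
        b = szy / szz
        a = y_mean - b * z_mean
        sse = sum((a + b * zi - yi) ** 2 for zi, yi in zip(z, scores))
        if best is None or sse < best[0]:
            best = (sse, a, b, c)
    return best[1:]  # (a, b, c)


def predict(params, experience):
    """Predicted position score after a given number of sessions."""
    a, b, c = params
    return a - b * math.exp(-c * experience)


def system_score(position_params, experience_by_position):
    """Step (b): product of the predicted scores of all positions."""
    score = 1.0
    for params, n in zip(position_params, experience_by_position):
        score *= predict(params, n)
    return score


def ranks(values):
    """Average ranks, 1-based, with tied values sharing the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def rank_order_correlation(x, y):
    """Spearman rank-order correlation (Pearson r computed on the ranks)."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)
```

In use, one curve would be fitted per position on the alternate-session half of the data, `system_score` would be evaluated for each day of the held-out half, and `rank_order_correlation` would compare those theoretical scores with the observed system scores.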


The difference between expected and actual performance of each member of the crew was calculated for each session. These differences were arranged according to the position turned over. If the turnover of one position interacted on another position, the performance deviations for the affected position should be a sample from a different population than other performance deviations for the given position. This sample difference was analyzed by the Mann-Whitney U test. The results showed four significant interactions with p < .05. One of these, the effect of the conversion plotter on the track correlator, was thought due to the method of measuring performance effectiveness. The other three interactions were: A new polar plotter reduced performance of the conversion teller, a new conversion plotter reduced the performance of the scope teller, and a new scope teller reduced performance of the scope reader. These interactions have the following common properties: (a) the degraded position is closer to the input of the system; (b) there is a telephone link between the affected positions; (c) the degraded position cannot readily store information and distribute the load over time; he becomes overloaded by having greater demands on his memory storage.

The third question posed for this study was “to what extent can the expected degradation in performance be overcome by various training methods?” This question remains unanswered. None of the methods tried in this
study resulted in adequate performance im- 
provement with practice by crews subjected 
to turnover resulting in gross skill depletion. 
The very best that was achieved was in the 
case of Crew D which for 10 out of 13 ses- 
sions performed at levels equal to or better 
than its performance prior to turnover. The 
training of this crew included individual 
training to a criterion, and knowledge of re- 
sults consisting of a system summary, posi- 
tion-oriented feedback and problem-focused 
written critiques. 
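The position-interaction analysis reported earlier in this section compared performance deviations for one position across two sets of sessions with the Mann-Whitney U test. A minimal sketch of that comparison follows. The deviation values and variable names are hypothetical, and the p value uses the large-sample normal approximation without tie or continuity correction, which is an assumption of this sketch and not necessarily the procedure the authors followed.

```python
import math


def mann_whitney_u(affected, other):
    """Return (u_affected, u_other, z, two_sided_p) for two samples.

    U is counted directly over all pairs (ties count one half). The p value
    comes from the normal approximation, with no tie or continuity
    correction.
    """
    n1, n2 = len(affected), len(other)
    u1 = 0.0
    for a in affected:
        for b in other:
            if a > b:
                u1 += 1.0
            elif a == b:
                u1 += 0.5
    u2 = n1 * n2 - u1
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))
    return u1, u2, z, p


# Hypothetical deviations (observed minus expected score) for one position:
# sessions in which a linked position had just been replaced, versus the rest.
sessions_with_linked_turnover = [-0.12, -0.09, -0.15, -0.07, -0.11]
all_other_sessions = [0.01, -0.02, 0.03, 0.00, -0.01, 0.02]
result = mann_whitney_u(sessions_with_linked_turnover, all_other_sessions)
```

With samples as small as these, exact tables of U would normally be preferred to the normal approximation.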

Finally, the results of this study contained 
some additional evidence against the postu- 
late that individuals can be treated as inter- 
changeable units over a wide range of be- 
havior. In order to perform the interaction 
analysis described above, it was necessary to 
derive “learning curves” or “performance im- 
provement functions” for each position in the 
system. Due to the lack of data for individual 
subjects at all levels of experience, it was 
necessary to “pool” subjects and derive “typ- 
ical” performance functions. If individuals 
were interchangeable over the range of be- 
havior sampled in this study, these “typical 
position performance functions” should pre- 
dict the performance of the turnover crews 
with as high a correlation as was found for 
stable crews. This did not happen. The rank- 
order correlation between predicted and ob- 
served turnover crew performance was +.40 
(p < .01). It was significantly different from
zero, but left a great deal of variability to be 
accounted for. One of the factors entering 
into this unpredicted behavior was, of course, 
the interaction effect described above. Even 
when the predictions were adjusted to allow 
for this factor, the correlation was increased 
only to +.48, a slight gain indeed. This lack 
of predictive power on the part of the typical 
performance curves suggests that the range of 
behavior for which individuals were inter- 
changeable was small in this system. 

Another implication of this low correlation is that turnover introduces changes in important determinants of system performance other than skill levels and forced storage. Further research is required to determine the extent to which such concepts as crew motivation, crew cohesion, crew standards of acceptable performance, and others derived from sociopsychological research may account for the behavior of organizations subjected to deliberate membership manipulation.


SUMMARY 


The data indicated that, under certain con- 
ditions, crew turnover degraded the ability of 
a complex information-processing system to 
accomplish its mission. Whenever turnover re- 
sulted in little “skill dilution” the perform- 
ance of the system was not greatly affected. 
Whenever the turnover resulted in consider- 
able “skill dilution” the system either failed 
to improve or declined in performance over a 
period of training sessions. In a chain-type 
organization the effect of turnover was found 
to extend beyond the position replaced. A 
degradation was found in the performance of 
the position passing information to the re- 
placement whenever the passing position had 
inadequate information storage facilities.
None of the training methods tried in this
study was very effective in counteracting the
effects of turnover.
THE REACTION OF INTERVIEWERS TO FAVORABLE
AND UNFAVORABLE INFORMATION


B. I. BOLSTER AND B. M. SPRINGBETT

University of Manitoba


Incidental findings (Bloom & Brundage, 
1947; Crissy & Regan, 1951; Newman, Bob- 
bit, & Cameron, 1946; Springbett, 1958) in 
studies of the employment interview have sug- 
gested that interviewers react more strongly 
to unfavorable information about the appli- 
cant than they do to favorable information. 
In addition, evidence of primacy effects has 
been produced (Springbett, 1958) i.e., the 
evaluation attached to an item of information 
presented first carries more weight than if it 
is presented later. 

This experiment is designed to achieve three ends: (a) a direct, systematic and, perhaps, “purer” test of what has been incidentally discovered and reported concerning favorable and unfavorable information; (b) to identify some of the variables governing primacy effects; and (c) to make a rough check on the assumption of a “negative set” used to explain previous results (Springbett, 1958). Some discussion of these aims here will simplify the task ahead of describing the experimental design, materials, and procedures.

1. Testing the effects of favorable and unfavorable information will best be accomplished by using a single medium of communication. The printed word seems best suited here for it rules out many variables associated with the physical presence of the applicant. Further there is evidence of the importance and adequacy of written verbal communication in the interview (Geidt, 1951; Springbett, 1958). In addition some equivalence of weight or importance of the favorable and unfavorable information is required before differential interviewer rating shifts can be attributed to favorability or unfavorability per se.

2. Primacy effects in the interview have been shown. In an attempt to explain why primacy effects based on judgments of appearance were less than those based on the application form it was suggested (Springbett, 1958) that in committing himself to a
highly favorable rating on appearance the 
interviewer felt he was committing himself to 
a risk and consequently became more sensi- 
tive to negative information. One might argue 
on this basis that, more generally, deviation 
from the noncommittal attitude represents 
“risk” and that the further the interviewer 
deviates from this base, in either a favorable 
or unfavorable direction, the stronger be- 
comes the tendency to regress to neutrality. 
This assumption may be tested by inducing 
interviewers to build up various degrees of 
acceptance or rejection and then introducing 
information of a contrary nature. 

3. If, indeed, there is a “set” to find and 
favor negative evidence in the interview, in- 
dividual differences in the strength of the set 
may be expected. If such exist, those with a 
high degree of the set will tend to place high 
ratings on unfavorable information and rela- 
tively low ratings on the favorable. Those 
with a lesser degree of the set would place lower ratings on the unfavorable and relatively higher ones on the favorable. If judges were ranked from high to low on “negative set,” their ratings on an item of unfavorable information would be distributed from high to low, while their ratings on favorable information would run from low to high, i.e., there would be a negative correlation between ratings on favorable and unfavorable information.


METHODS AND PROCEDURES 
The setting in which 

that of assessing the 

versity contingents 
Training Corps. Thi 
tages: first, there was 
pe rienced officers 
1959, p. 16) operating 
eotype as to what 
Sydiaha, 1959 


available 
in relation to a comr 
constitutes 


personnel 
a good officer 
and, in addition, extensive files 
interview reports provided a realistic source of 
sonal data for the 
materials 


construction ol experimen 
As indicated above,
municated by the 
ituation. The 

to the 
pect to its importance in decision 
details of method 
into two parts, that concerned with the 


intormation w 
printed word in the 
information had to be | 
scaled with 


naking 


ind procedure 


interviewers’ task and 


quently, the 
construction 
of written protocols and the rating form, and that 


concerned with the experiment proper 


Construction of Protocols and Rating Form 


Selection and scaling of items. Over officer 


cadet. selection reports combed to secure 15 


items of relevant information evenly divided betw 


were 


favorable and unfavorable. These were edited for 


and to the end that each presented 


clarity, ambiguity, 
item of information 
These were arranged in two lists 


a single 
favorable and 
unfavorable, and a preliminary ordering as to 


parent intensity (importance) followed. Some 
construction and reconstruction was necessary (a) 
fill in gaps in apparent intensity, (b) to have for 


each favorable item a counterpart ol the 
tensity in the unfavorable list, (c) to provide a 
ple of items from each major area explored in the 
systematic officer cadet selection interview 
100 items remained 

The next step was to scale these items in I 
importance. Each of 


Work 


personnel offi 


their relative intensity or 
items was typed on a separate slip of paper 


] 


ependently five full-time army 
he 


, 
t 

rs first of all sorted th two pile 

favorable and unfavorable 

say,” was ind iten 


discarded 


used 
Second, each 
able) was sorted into 
by equal appearing int 
ing interview decisions 


From these operation 


plac ement agreement 
I 


retained. This reduces 

category to 
The next ope: 

volved 25 full-time arm 

intermediate judges 
Each group of 


ition 


pair-comparison fort 


compared with every 


M.S 


adjacent Ferguson 1952, p. 309) From 


values were calculated using Thur 


group see 
these results scale 
stone’s Law of Comparative Judgment—Case V as 
Guilford, 1954, 
17 ) The 


were transformed to a scale 


umption. (For these procedures see 


Ch. 7; the 


obtained scale 


statistical procedures are on p 
values 
with lard deviation of two 
Constructi Using these 60 item 
and item values 12 protocols of interview inform: 
items 

4 and B) each contained 1 

graduated from low to high scale 

except that they cor 
total scale 


equal, within one-half 


tion were prepared, each containing 10 
Two of them 

items 

wo (C and D) were sim 


tained unfavorable items values in 


each of these 
ot 1° 

The remaining eight protoco A 
X, Y) are 1 
the first five 
unfavorable) the last five ar 
Each group of five iter 
ither high or low 
each alue 
of the low 
composition 


protocols 


xs 
ich protocol 


LI 
rie 


called divide 


items are OI one or 


lavora 
yposite cat 
egory classified a 
he same 


+} 
snows tne 


rroup ¢ | t 
} 

Construction of the i ( In view of the 
speculations in th trodt ncerning primacy 
l it was desired to hav protocol informa 
tion presented against a background of an initial set 
Three “sets” were 


ance”—this 1 


employed a set ol accept 
giving the 
information th the hypothetical candidate 


M score of ] S n ipproxim 


rt 


preliminary 


i set of “rejection” prod 1 by 


scort n” produced 


ing an M score 


l waived only 


assign 
nimal and 
exceptional compensat 


qu ilities 


M 


rroduced by 


as these score were issigned 


rating form has three vertical lines represent- 


leparture points of ept, utral, reject 
| lines one 
marked 
accept point lie eight 
respectively, of the 

I minimal” ré 


tending | 1 the 
additional eight inter 


horizontally 


tor each item in rotocol I nes are 
ff in intervals 
intervals to the 
neutral point. Ex 
ject and accept points 


ction or accept 
The Experiment Proper


The subjects (interviewers) were given a rating form and an M score fixing their first point of departure. Each had a protocol enclosed in an open-end envelope. The subjects then exposed the first item of the protocol and checked the rating scale, on the line for that item, to show the shift, if any, in their evaluation of the hypothetical candidate induced by that item. This procedure continued through all 10 items of the protocol.

Sixteen raters were randomly


The 


sessions of four protocols eact 


groups of four protocols 
of the four subgroups receive 
ferent that possibl 
counterbalanced by way 


orders sO 


square 
The 16 subjects were army 
ficers. All had previous training 
psychology and their application 
previous experience in the selectior 
Eight had two or 
four held one degree, and 
matriculation plus 
None had taken part in 
about the preliminary phas« 
It may be noted that 


the procedures was cart 


cadets 


some I 


ensure smoothness of 


materials, and mode o 
nificant difference in 


protocols 
RESULTS

The care taken in the construction of the protocols arose out of the concern that they be realistic and that the weightings assigned to the items would be valid in the sense that they would elicit proportionate shifts of rating in the experimental situation. This can be checked by correlating the weights of the items with the amount of shift they elicited.

As will be seen below there are strong primacy effects in that the first items in Protocols A, B, C, and D and the sixth item in the remaining protocols produced shifts in ratings far greater than subsequent items. The sixth item is where the shift from favorable to unfavorable information, or vice versa, occurs. The correlation between the weights of these items and the rating shifts is .797 (p < .05). The correlation between the summed weights of Items 2 to 5 and to 10 versus rating shifts is .820 (p < .05).

The ratio of item weight to rating shift for the “primacy” items is 2:1, i.e., two units of item weight elicit one unit of rating shift. The corresponding ratio for the remaining items is 6:1.
orable 
Information

summary of results is shown in Figures 
s 3. In Figure 1 the results of Protocol A 
(favorable) and C 
Procotol A starts from 
“reject” (M 
that it takes on the average 8.8 items to shift 
Protocol ¢ 


ol accept (M score of 


The 
? 


(unfavorable) are shown 


a departure point 


score of 155 It will be n 


the ratings from reject to accept 


I 
Starts 


from a rating 


17 Che shift from accept to reject re 


juires 
on the average, only 3.8 ite At the en 
j ‘ 
| 


] items Protocol A has induce 
otocol ¢ 


a 
ing shift of 17.5 units, Pr 
t 12.67 (f | 

the results of Protocols 


Protocol B 


rating of accept and D 


In Figure 2 
LD are shown. In this case 
able) starts from a 


from reject. i.e., the information confirms the 


1 


initial set. Here, eigl 
of scale were available 


€ 
direction of the 
to the rater to 
in increasing degree of acceptance or rejec- 


‘ a . 6 os ar 
tion. By the end of 1 e iten l 


Fig. 1. Rating curves on two basic protocols (A and C) of selection information. (Mean of 16 raters. Units noted on each curve represent the total item weight of the designated items.)

Fig. 2. Rating curves on two basic protocols (B and D) of selection information. (Mean of 16 raters. Units noted on each curve represent the total item weight of the designated items.) Horizontal axis: items. The plot marks the maximum possible rating and the Accept (8) or Reject (0) start line. Legend: broken line, unfavorable information; solid line, favorable information.


Protocol D reached the maximum point, while 
only 6 of the raters using Protocol B had done 
so. The total amount of shift induced by 
Protocol D is 6.9 units and 5.8 units for B; 
t 7.70 (p < OF 3. 

Figure 3 summarizes the data derived from the eight divided protocols. In half the protocols the first five items were favorable, in the other half unfavorable. The point of departure for the raters was a neutral rating (M score of 160). At the end of the fifth item the positive protocols elicited an average rating shift of 9.5 units, the negative protocols 10.8 units; t = 5.13 (p < .001).

Fig. 3. Summary rating curves where equivalent amounts of information followed the same amounts of information of opposite sign. (Units noted on each curve represent the total item weight of the designated items.) Horizontal axis: items; vertical axis: accept or reject.

The second half (last five items) of these 
protocols were of contrary sign to the first 
half. The shift in rating induced by the posi- 
tive items was 12.3 units, by the negative 17 
units; t = 3.33 (p < .01). Interpretation of
these results is complicated by the fact that 
the level of acceptance or rejection at which 
the final five items were introduced varied by 
virtue of being dependent upon the ratings 
reached in the first five items (see below). 
With this factor controlled the differences 
would be greater than those obtained. 


Primacy Effects 


Following the line of thought indicated in the brief introductory remarks, the first concern is to determine whether primacy refers not simply to the first item in a series but rather to the first item that challenges an existing set. Second, it was suggested that primacy effects so defined would increase as a function of the strength of the set it challenged, i.e., the higher the rating at the point of challenge the greater the primacy effect. Another factor may be the weight or importance of the item that challenges the existing set, i.e., the question is whether the primacy effects of the lightweight and heavyweight items, as shown in rating shifts, are disproportionate to their weight.

Primacy related to shift of direction in the evidence. The clearest demonstration that an item has disproportionate effects in shifting ratings when it challenges an existing set is to be found in a comparison of Figures 1 and 2. In Figure 1 both the positive and negative protocols challenge a set based on M scores. In Figure 2 the protocols do not challenge, but, on the contrary, confirm the sets based on M scores.

The weights of the first items for A and C are 6.5 and 6.7; for B and D, 6.5 and 7.7. The shifts in ratings due to the first items in A and C are 3.7 and 8.8, respectively; for B and D the corresponding figures are 0.2 and
0.4. On the average the shift induced by a
change of direction in the evidence is approx- 
imately 25 times as great as that when the 
direction of the existing set is confirmed. 

Primacy as a function of the height of the 
preceding ratings. In the divided protocols, 
Item 1 challenges a “neutral” set either in the 
direction of acceptance or rejection. At the 
end of Item 5 high ratings of acceptance or 
rejection have been induced. Item 6 then 
challenges the ratings established at Item 5, 
i.e., Item 1 has to produce its effects against a 
low rating, Item 6 against a high rating. 

Table 2 shows the ratio of rating shift to 
item weight for Items 1 and 6 in each of the 
divided protocols. The average ratio for Item
1 is .411, for Item 6 it is .637; t = 21.15
(p < .001).
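The comparison of Item 1 with Item 6 rests on the ratio of rating shift to item weight in each divided protocol. The sketch below shows one way such a comparison could be computed. It assumes a paired t test across the eight divided protocols, which the text does not state, and the shift and weight values are hypothetical placeholders, not the entries of Table 2.

```python
import math
from statistics import mean, stdev


def shift_per_weight(rating_shifts, item_weights):
    """Ratio of rating shift to item weight, one value per protocol."""
    return [s / w for s, w in zip(rating_shifts, item_weights)]


def paired_t(first, second):
    """t statistic for the mean of paired differences (second minus first)."""
    diffs = [b - a for a, b in zip(first, second)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


# Hypothetical values for eight divided protocols (not the study's data).
item1_weights = [6.0, 6.5, 7.0, 7.5, 6.2, 6.8, 7.2, 7.6]
item1_shifts = [2.4, 2.7, 2.9, 3.1, 2.5, 2.8, 3.0, 3.1]
item6_weights = [6.1, 6.4, 7.1, 7.4, 6.3, 6.7, 7.3, 7.5]
item6_shifts = [3.9, 4.0, 4.6, 4.7, 4.1, 4.2, 4.7, 4.8]

ratio_item1 = shift_per_weight(item1_shifts, item1_weights)
ratio_item6 = shift_per_weight(item6_shifts, item6_weights)
t_value = paired_t(ratio_item1, ratio_item6)
```

A positive t here indicates a larger shift per unit of weight at Item 6, where the challenged rating is high, than at Item 1, where it is neutral.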

Other lines of evidence involving all the last five items show consistently that the rate of return to the “neutral” line is a function of the height of the rating which is challenged and that this rate decelerates as the neutral line is approached.

Primacy as a function of item weight. Table 3 shows the ratio of rating shift to item weight for low value items and for high value items. It will be noted that the low and high value items are equally distributed between Items 1 and 6. The results show that the unit of rating shift per unit of item weight is significantly greater for high value items: t = 3.37 (p < .01).

TABLE 2
RATING SHIFT INDUCED IN THE DIVIDED PROTOCOLS
(Rows: Means; Standard deviations)
Note: t = 21.15


TABLE 3
RELATIVE EFFECT OF WEIGHT OF ITEMS ON RATING SHIFT BASED ON ITEMS 1 AND 6 OF DIVIDED PROTOCOLS
a Low value items.
b High value items.

Correlation of Ratings Based on Favorable
and Unfavorable Information


As indicated in the introduction, individual
differences in negative set should result in a
negative correlation between the magnitudes 
of rating shift induced by favorable and un- 
favorable information. 
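The prediction amounts to computing, across raters, the product-moment correlation between the size of the shift each rater gives to unfavorable information and the size of the shift the same rater gives to favorable information. A minimal sketch follows. The function name and the 16 pairs of values are hypothetical and serve only to show the calculation.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical magnitudes of rating shift for 16 raters (not the study's data).
unfavorable_shift = [4, 6, 5, 7, 3, 8, 6, 5, 4, 7, 6, 5, 8, 3, 6, 7]
favorable_shift = [2, 4, 3, 4, 2, 5, 3, 3, 2, 5, 4, 2, 5, 1, 3, 4]

r = pearson_r(unfavorable_shift, favorable_shift)
```

Under the negative-set hypothesis r would be negative; a positive r would instead point to a general readiness to react to information of either kind.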

Table 4 shows three sets of correlations for each of the divided protocols: (a) between Items 1 and 6 which are of opposite sign, (b) between the cumulative rating at Item 5 and the rating at Item 6 (again of opposite sign), (c) between the cumulative ratings at the end of Item 5 and that at Item 10. (Each cumulative rating represents the terminal point of a five-item series and within each protocol the two series are of opposite sign.) All but four of the correlations are positive and the average correlation for each set is positive. As these averages are based on eight correlations which in turn are based on an N of 16 it seems likely that a true positive correlation exists between ratings based on favorable and unfavorable information. The implications of individual differences in negative set are not borne out; rather, the results imply differences in readiness to react to information. Those whose reactions are strong to negative information react in a relatively strong fashion to favorable information. In absolute terms, though, reaction to negative items is greater than that to positive.


TABLE 4
INTERCORRELATIONS (r) BETWEEN SHIFTS OF RATING BY SIXTEEN RATERS AT CRITICAL POINTS OF EIGHT DIVIDED PROTOCOLS
Columns: Description of Rating Shift (Negative, Positive); Item 1 vs. 6; Item 5 vs. 6; Item 5 vs. 10. Rows: High, Low, Mean, Over-all Mean.
Entries as extracted, cell positions not recoverable: 612 144 285 256 607 190 572 605 120 182 833 329 063 424


DISCUSSION


Before commenting on the main results it seems worthwhile to draw attention to the source protocols. Some of the earmarks of cross-validation are evident in the substantial correlation between the item weights assigned by one set of experienced personnel officers and the rating shifts induced by these item weights when used by a second set of personnel officers. It tends to confirm Sydiaha’s (1959) suggestion of a stereotype, i.e., that there is a commonly shared standard amongst army personnel officers defining the “good soldier”: this is a necessary condition for the agreement found in this study between item weights assigned in an abstract, analytical situation and rating shift induced by the item weight when assessed against a background of other information.

The main results have some interest in that they confirm earlier findings (Springbett, 1958) but indicate some modifications of their interpretation.


First of all, there is clear-cut evidence that shifts in
rating in the direction of rejection are more easily
induced than shifts in the direction of acceptance: there
is a differential sensitivity to negative evidence.
However, the logic of the earlier interpretation
predicting a negative correlation between ratings based
on negative and positive evidence finds no empirical
confirmation. Rather, it would seem that the further the
interviewer commits himself and deviates from a
noncommittal position, the more radically he reacts to
information which threatens the validity of his
commitment. He does react more readily to negative than
to positive evidence, but differences between
interviewers seem more accurately described in terms of a
readiness to commit themselves, both negatively and
positively. However, those most ready to commit
themselves are quickest to regress to the noncommittal
position in the face of contrary evidence.

The results in relation to primacy effects confirm their
presence but suggest a modified definition of the term.
Those items inducing a rating shift disproportionate to
their importance did so only when they were the first to
challenge a rating to which the interviewer was
committed. As it operates in the interview situation,
primacy refers to the first change of direction in the
evidence. The magnitude of these effects then becomes a
function of the degree of commitment (height of rating).
It is also a function of the weight of the challenging
item. It is somewhat of a surprise to find that
“heavyweight items” induce greater rating shifts per unit
of weight than do “lightweight items.”

In terms of practical considerations it seems reasonable
to suppose that in those interviews where on-the-spot
decisions are made the factors relating to primacy
effects would operate as they have done in this
experimental situation. It is not at all certain they
would so operate when decisions are deferred and all of
the information is reviewed before making a decision.
However, to the extent that these findings generalize to
the interview in real life, they point to a danger area,
i.e., that an item of information, or the uncovering of
some characteristic, toward the end of the interview,
which runs counter to the general trend of evaluation, is
apt to exert undue influence, undue in the sense that it
will carry more weight than if it had been encountered
earlier.




SUMMARY 


This paper reports an investigation of the effects of
favorable and unfavorable information, and an analysis of
primacy effects in an experimental situation analogous to
the employment interview. Earlier findings of interviewer
sensitivity to negative evidence are confirmed but the
earlier interpretation is modified. Primacy effects are
shown to be related to the first shift in direction of
evidence; magnitude of effects is shown to be related to
the degree of interviewer commitment at the point of
shift and the weight of the challenging information.


The present investigation is concerned with
the relationships between a number of objec- 
tive tests (as defined by Cattell, 1957b, pp. 
225, 897; Scheier, 1958) of personality, soci- 
ometric ratings, and an index of frequency of 
sick bay visits. Previous studies have demon- 
strated that certain personality test scales are 
related to the incidence of somatic illness 
(Staton & Rutledge, 1955) as well as to the 
nature of the particular disability group 
(Wiener, 1956). The correlations obtained 
with frequency of dispensary visits have gen- 
erally been small, and studies such as those 
above have been limited to the self-report, or 
questionnaire, approach to personality ap- 
praisal. 

In a sample of 95, Staton and Rutledge
(1955) obtained point biserial correlations
between MMPI scales and membership in a group
having a high frequency of infirmary visits as
against a low frequency group. Correlations ranged from
.05 to .36 with four out of nine reaching sig- 
nificance at the .05 level or better. As might 
be expected the highest correlation was found 
with the Hypochondriasis scale. When the 
high frequency and low frequency groups 
were considered by sex, none of the r’s 
reached significance in the male sample while 
two of the r’s were significant in the female 
group. The authors suggested that perhaps 
certain personality-physical illness associa- 
tions are more prevalent among groups of 
women than men. Wiener (1956) obtained significant
differences between several disability groups on various
MMPI scales, thus suggesting an association between
certain measurable personality traits (particularly
Hypochondriasis, Depression, and Hysteria) and physical
disability. This association seemed to hold true for
disabilities often thought of as psychosomatic in origin,
as well as for disability groups, such as those having
gunshot wounds and flat feet, not often thought of as
being associated with personality characteristics. As the
author notes, the problem of the distinction between
predisposition and reaction reflected in the test scores
of these groups was insoluble.

1 The present data were collected as part of Bureau of
Medicine and Surgery Project Number NM 18 01 09.1. The
opinions expressed are those of the author and are not to
be construed as being official or in any way
representative of the United States Navy.

2 Now at United States Naval Personnel Research Field
Activity, San Diego, California.

The author wishes to gratefully acknowledge John A. Most,
Medical Corps, USN, for his invaluable assistance in many
phases of the present investigation.

The problem area of the relationship be- 
tween sociometric ratings and frequency of 
dispensary visits has also been considered. 
French (1951) found that frequent sick bay 
visits among naval recruits bore a significant 
negative relationship to peer nominations 
based on a friendship criterion (acceptability 
as a liberty companion) but found no rela- 
tionship with peer nominations of leadership. 
Izard and Manhold (1954) and Izard (1959), 
however, reported a significant negative rela- 
tionship between frequency of sick bay visits 
and peer nominations of leadership in a group 
of naval aviation cadets. They also found that 
those Ss judged to be in a psychosomatic 
classification had a lower mean sociometric 
leadership score. Wellingham (1959) found a 
similar relationship between dispensary visits 
and leadership ratings among aviation cadets. 
He also obtained significant negative relation- 
ships between sick call frequency and three of 
seven academic courses included in the naval 
aviation preflight training program, as well as 
with measures of physical training, instruc- 
tors’ ratings of officer-like qualities, and final 
grades in flight training. 

The present article examines the relationship between
sick call frequency and sociometric ratings of pilot
proficiency, officer-like qualities, and social
acceptability in an operational air group, and also
examines the relationship of sick call frequency to
objective personality tests which do not rely primarily
on the self-report technique. It was felt that
a personality test of the conventional type, 
such as the MMPI, would not yield unequiv- 
ocal results in this context, because such ques- 
tionnaires use questions relating directly to 
health in order to infer standing on personal- 
ity traits. 

Ordinarily it is probably quite justifiable to 
use responses to health questions to infer per- 
sonality characteristics, such as is done in the 
case of the Hypochondriasis scale of the 
MMPI. For example, suppose 10% of all 
individuals in the general population respond 
positively to the question “I have a great deal 
of stomach trouble” (Hathaway & McKinley, 
1951), and only 1% of such persons have 
genuine organically determined stomach dis- 
orders. Then one will identify a hypochon- 
driac 9 times out of 10 on the basis of a 
positive response to this question. (A hypo- 
chondriac, for the purpose of this illustration, 
is a person complaining of stomach disorders 
who has no organic stomach trouble.) 

If one were to test only those persons who 
consult a doctor specializing in disorders of 
the stomach, however, it is conceivable that 
all of them might answer the same question 
positively. It is also conceivable that the vast 
majority of these persons suffer from organic 
stomach trouble. For this select population, 
therefore, positive response to this question 
would be a less appropriate indicator of hypo- 
chondriasis as defined for our illustration. 
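
The arithmetic behind this illustration can be sketched as follows. The sketch reads the 1% figure as 1% of the general population and assumes that every person with organic stomach trouble answers the question positively; neither point is stated outright in the text. The 90% organic rate used for the specialist's patients is a hypothetical stand-in for "the vast majority," and the function name is illustrative only.

```python
def hypochondriac_share(p_positive: float, p_organic_positive: float) -> float:
    """Share of positive responders who have no organic stomach trouble.

    p_positive          proportion of the population answering positively
    p_organic_positive  proportion of the population that both answers
                        positively and has organic stomach trouble
    """
    if not 0 < p_positive <= 1:
        raise ValueError("p_positive must lie in (0, 1]")
    if not 0 <= p_organic_positive <= p_positive:
        raise ValueError("p_organic_positive must lie in [0, p_positive]")
    return (p_positive - p_organic_positive) / p_positive


# General population, figures from the text: 10% positive, 1% organic
# (assumed here to be 1% of the whole population, all answering positively).
general_population = hypochondriac_share(0.10, 0.01)

# Patients of a stomach specialist: all answer positively; the 90% organic
# rate is a hypothetical value standing in for "the vast majority".
specialist_patients = hypochondriac_share(1.00, 0.90)

print(f"general population:  {general_population:.2f}")
print(f"specialist patients: {specialist_patients:.2f}")
```

Under these assumptions the same positive response points to hypochondriasis in about 9 of 10 cases in the general population but in only about 1 of 10 among the specialist's patients, which is the contrast the two paragraphs draw.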

It might be similarly inappropriate to ask 
questions about health of persons who fre- 
quent sick bay, and infer personality traits 
from their answers. The response “Much of 
the time my head seems to hurt all over,” 
particularly when made by a person who is 
often on sick call, will frequently, among 
other things, mean he has a psychosomatic 
disorder, or he is suffering from some other 
organic disorder, and perhaps less frequently 
that he is a hypochondriac. 

In view of these considerations it was felt that any
significant correlations found between personality or
temperamental traits measured in objective tests and
frequency of dispensary visits would be especially
meaningful. The correlations would seem, to some extent,
to obviate the problem arising in self-report data of
having to determine whether a reported concern over
bodily function and disabilities arose from the
disability or whether certain personality traits are to
be found associated with a tendency to frequently seek
medical advice.


PROCEDURE 
Subjects 
The Ss were 81 Marine Corps officer helicopter pilots who
were selected for the present project on the basis of
availability during the testing periods. These 81 Ss
represent approximately one-third of the helicopter
pilots then assigned to Marine Air Group 26 and are
considered to be representative of the larger pilot
population. Since no single squadron comprising this air
group was large enough to provide the total sample, it
was necessary to obtain Ss from five separate squadrons.
All Ss were qualified pilots having satisfactorily passed
all selection and training requirements, and were
assigned to helicopters in the present operational group.
They ranged in age from 21 to 38 years with a mean of 25
years. Educational level attained ranged from 11 to 18
years of school completed, with a mean of approximately
15 years. Seventeen percent had gone only as far as high
school while 55% had completed four or more years of
college. Ss ranged in rank from 2nd Lieutenant to Major,
but the majority (approximately 77%) were 1st or 2nd
Lieutenants. Data from health records in which each visit
to the infirmary, each diagnosis, and each treatment is
indicated, were available for only 55 of the 81 Ss, so
that all correlations with sick call frequency are based
on the smaller N of 55.


Tests 


The objective personality measures used were those
comprising Cattell’s (1957a) Objective-Analytic
Personality Test Battery (O-A Battery). The battery, as
administered for the present investigation, consists of
some 68 separate tests and test scores which are combined
in such a way as to yield 12 individual factor scores.
For definitions of specific factors in the O-A Battery
the reader is referred to Cattell (1957b). Tests were
administered over a period of a day-and-a-half along with
some standard personality inventories.3


Index of Sick Call Frequency 


For the sample of 55, all visits to sick bay were
recorded and a total count was obtained for each S. Since
the pilots in the present investigation had varying
lengths of military experience, the total count was then
divided by the number of months between the first entry
in the record form and the date of the present survey.
This yielded an average number of visits per month for
each S based on a period of at least 7 months but not
more than 3 years.

3 The validities of these personality inventories are
considered elsewhere (Knapp & Most, 1960).
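
A minimal sketch of the index computation described in this section follows. The dates and visit count are hypothetical, and the way fractional months are counted (30-day months for the day remainder) is an assumption; the text says only that the total count was divided by the number of months between the first record entry and the survey date.

```python
from datetime import date


def months_between(first_entry: date, survey: date) -> float:
    """Elapsed months, counting any leftover days as 1/30 of a month (assumed)."""
    whole = (survey.year - first_entry.year) * 12 + (survey.month - first_entry.month)
    return whole + (survey.day - first_entry.day) / 30.0


def sick_call_index(total_visits: int, first_entry: date, survey: date) -> float:
    """Average number of sick bay visits per month of recorded service."""
    months = months_between(first_entry, survey)
    if months <= 0:
        raise ValueError("survey date must fall after the first record entry")
    return total_visits / months


# Hypothetical pilot: 16 recorded visits over exactly 24 months.
example_index = sick_call_index(16, date(1958, 1, 15), date(1960, 1, 15))
print(round(example_index, 2))
```

Dividing by the length of the record puts pilots with long and short service on one scale, which is the stated reason for using a rate rather than the raw count.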


Nominations of Pilot Proficiency, Officer-Like 
Qualities, and Social Acceptability 


The sociometric measures used in the present
investigation were peer nomination forms which carried
brief definitions of the three qualities under
consideration. Ss were given a list of names of only
those in the investigation who were from their own
squadron, from among which each S was asked to pick the
top 25% and the bottom 25% for each of the three
variables being rated. Each time a given individual’s
name appeared in the top quarter a score of +1 was given
and in the bottom quarter, -1. Thus, every person
received a score on each variable, the possible range of
which was a positive to negative number corresponding to
the number of Ss in the particular squadron.

It was not possible to have everyone in the study select
from a total list of all other individuals, since not all
of the Ss would have had an opportunity to know all other
Ss represented. Therefore, nominations were made only
from a list of names supplied within each squadron. Once
scores were obtained for each S within his squadron, it
was possible to select a constant by which all scores
within a squadron could be multiplied to yield scores
comparable from one squadron to another. Thus, an S
nominated by all individuals in one squadron as being top
in a trait would receive approximately the same score as
another S picked by all individuals of another squadron,
even though the number of Ss within the squadrons was
different. The instructions and trait definitions for use
in the peer nominations were given to the participating
Ss on separate forms and are presented below. The
appropriate N for each squadron appeared where bracketed
percentages are shown below.


Pilot proficiency. From this list of names of pilots in
your squadron, we would like you to pick out (a) the
[25%] whom you consider the most proficient pilots
(considering all the factors that are generally thought
of as going to make up a “professional” naval aviator),
and (b) the [25%] whom you would rank as the least
proficient of this particular group. You need indicate no
preference or rank within these two groups of [25%]. Your
answers will be confidential, and you need not sign your
name. Please note that this ranking system applies only
to the group of people named here. You are not
necessarily implying that those you picked as “most
proficient” are the best you’ve ever flown with, nor that
those whom you picked “least proficient” have any
deficiencies at all. You are merely indicating their
comparative standing, in your opinion, within this group.

Officer-like qualities. Next, rank the members of this
group, in the same way, as to the presence of those
qualities which are ordinarily thought of as
“officer-like qualities.” In other words, pick the [25%]
who are (a) considered by you to be highest in military
proficiency and officer-like qualities, and (b) the [25%]
who would rank lowest, within this group, in the same
qualities.

Social acceptability. Lastly, rank the members of this
group, in a similar manner, as to: (a) the [25%] who seem
to “fit in” best with the squadron as a whole, in other
words those who contribute the most, by reason of their
personality and general disposition, to harmony and good
feeling; and (b) the [25%] whom you consider to rank the
lowest in this same category. It may be of help if you
would try to pick these men as though you were expressing
preference among them, as to those whom you would most
like to have as a companion on an extended cruise (purely
from the standpoint of personality and social
acceptability) and those whom you would least desire
under the same circumstances.
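
The nomination scoring and the squadron constant described in this section can be sketched as below. The text does not give the constant actually used, so the rule here (scale each squadron so that a unanimous top nomination equals a common maximum of 100) is an assumption, as are the function names, member labels, and ballots.

```python
def raw_nomination_scores(members, ballots):
    """+1 for each appearance in a top quarter, -1 for each in a bottom quarter."""
    scores = {name: 0 for name in members}
    for top_names, bottom_names in ballots:
        for name in top_names:
            scores[name] += 1
        for name in bottom_names:
            scores[name] -= 1
    return scores


def rescale(scores, max_raw, common_max=100.0):
    """Multiply every score in a squadron by one constant (assumed rule)."""
    if max_raw <= 0:
        raise ValueError("max_raw must be positive")
    constant = common_max / max_raw
    return {name: constant * score for name, score in scores.items()}


# Hypothetical small squadron: 3 ballots, one name per quarter.
small = raw_nomination_scores(
    ["a", "b", "c", "d"],
    [(["a"], ["d"]), (["a"], ["d"]), (["a"], ["c"])],
)
small_scaled = rescale(small, max_raw=3)

# Hypothetical larger squadron: 7 identical ballots, two names per quarter.
large = raw_nomination_scores(
    list("stuvwxyz"),
    [(["s", "t"], ["y", "z"])] * 7,
)
large_scaled = rescale(large, max_raw=7)

print(small_scaled["a"], large_scaled["s"])
```

With this constant, member "a" (named top on all 3 ballots) and member "s" (named top on all 7 ballots) receive approximately the same scaled score, which is the comparability property the procedure aims for.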


RESULTS 


The range of mean number of sick call 
visits per S was zero to 2.60 visits per month. 
The mean number of visits per month for all 
55 Ss was .64 with the standard deviation 
being .57 visit per month. 

Product-moment correlations between the 
objective personality test factors and sick call 
visits are shown in Table 1. Significant cor- 
relations were obtained against three of the 
12 O-A Battery factors. To more clearly 
understand these associations, and also to 
present information to others working in the 
area of objective test development, the sick 
call index was correlated with each individual 
test in the three factors where over-all signifi- 
cance was obtained. These correlations are 
presented in Table 2. 
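
For readers less familiar with the statistic, a product-moment correlation of the kind reported in Tables 1 and 2 can be computed as in the sketch below. The factor scores and visit rates are invented for illustration and are not the study data.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical factor scores and visits per month for six Ss.
factor_scores = [42.0, 51.0, 47.0, 60.0, 55.0, 38.0]
visits_per_month = [0.2, 0.7, 0.4, 1.3, 0.9, 0.1]
print(round(pearson_r(factor_scores, visits_per_month), 2))
```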


The complete matrix including personality
variables against which significant correlations
with the sick call index were obtained
is included in Table 3, as are r’s with other 
relevant test and life behavior indices. 
Through examination of Table 3, it will be 
noted that high positive correlations were ob- 
tained between frequency of sick call visits 
and objective test factors UI (Universal In- 
dex) 16 and UI 22. Persons scoring high on 
UI 16 have been depicted (Cattell, 1957b, p. 
237) as quick, not easily upset, more re- 
strained, more critical, and more determined 
and effective in their actions. They have been 
further characterized as being insuggestible 
and as tending to emphasize personal and 
esthetic values. High UI 22 scores are held to be
indicative of “an alert, eager, controlled, contact with
external events” (Cattell, 1957b, p. 251). High cognitive
fluency, much speed in simple mental processes and bold,
uncritical reactions are characteristic of persons high
on this factor.

It will also be noted in Table 3 that correlations of
pilot proficiency (PP), officer-like qualities (OLQ), and
social acceptability (SA) with sick call visits were
significantly negative.


TABLE 1

CORRELATIONS BETWEEN O-A PERSONALITY TEST BATTERY SCORES
AND FREQUENCY OF SICK CALL VISITS

[Columns: Universal Index Number; factor title;
correlation. Legible Universal Index numbers: 16, 17, 18,
19, 20. Legible factor titles: Corticalertia; Neural
Reserv...; Anxiet...; Realism vs. ...; Self-Sentiment;
Apathy. Correlation values are not legible. Footnotes
mark significance at the .05, .01, and .001 levels.]


TABLE 2

CORRELATIONS BETWEEN SICK CALL FREQUENCY ... IN THREE O-A
BATTERY ...

[Legible row labels, truncated: ...ts perceived in ...;
High ideomotor spee...; High accur...; High speed
alternating perspec...; ... to state assum...; ...
mischievous hum...; ... possible for others in given ...;
...oportion of fluency on self; ...rtion of fluency on
dreams; ... ratio initial to final performance on CMS
tes...; ...ncy to perceive many threatening objects in
...]


TABLE 3

INTERCORRELATIONS AMONG THE COMPONENTS OF A HYPOTHETICAL
CLUSTER OF VARIABLES

[Legible column heads: Variable; N; UI 22; Sick Call;
OLQ; SA; Educ. Legible row labels: UI 16; UI 22; UI 26;
OLQ; SA. Cell values are not legible. Footnotes mark
significance at the .05, .01, and .001 levels.]

DISCUSSION 


The mean number of monthly dispensary 
visits per S in the present study was .64. 
Izard (1959) reports a mean number of dis- 
pensary visits of 10.84 for their psycho- 
somatic cadet group and of 6.53 for their 
nonpsychosomatic group over an 8-month 
period, which would represent a mean of 
approximately 1.35 and .82 visits per month 
for the two groups, respectively. Since these 
latter figures are from a cadet population and 
since frequency of dispensary visits has been 
shown to be negatively related to over-all per- 
formance in flight training (Wellingham, 
1959), the lower obtained mean number of 
dispensary visits for the present group of 
successful trainees was expected. 
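
The conversion of the cadet totals to monthly rates is simple division over the 8-month period, sketched here with the figures quoted above; the variable and function names are illustrative.

```python
def visits_per_month(total_visits: float, months: float) -> float:
    """Convert a total visit count over a period into a monthly rate."""
    if months <= 0:
        raise ValueError("months must be positive")
    return total_visits / months


psychosomatic_rate = visits_per_month(10.84, 8)      # cadets classed psychosomatic
nonpsychosomatic_rate = visits_per_month(6.53, 8)    # remaining cadets
present_sample_rate = 0.64                           # mean reported for the 55 Ss

print(f"{psychosomatic_rate:.3f} {nonpsychosomatic_rate:.3f} {present_sample_rate:.2f}")
```

Both cadet rates exceed the .64 visits per month found in the present sample, consistent with the comparison drawn in the paragraph.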

It was found that frequency of sick bay visits was
significantly correlated with three of the O-A Battery
factors. The association was particularly high with UI 16
(see Table 1), the obtained r being .57. From the
examination of the kinds of tests represented in some of
the O-A Battery factors (especially UI 16), it might be
hypothesized that intelligence, or an
intelligence-related element of UI 16, is contributing to
a portion of the variance accounted for by these factors.
Cattell, Knapp, and Scheier (in press) have presented
data from the second-order factorization of five studies
indicating that first-order Factors UI 16 and UI 1
(Intelligence) both load, in a positive direction, a
second-order factor which they have termed “Expansive
Ego-vs.-History of Difficulty in Emotional Problem
Solving.” Thus, were a portion of the UI 16 variable
actually measuring intelligence it might be hypothesized
that it is this contributor which is accounting for the
predicted variance in the obtained correlation
coefficients. In order to further investigate the
possible contributing factors, two additional variables,
intelligence as measured by the Aviation Qualification
Test (AQT) (Ambler, 1955) and educational level, were
introduced into the matrix. The obtained r’s are included
in Table 3. Although the correlation between intelligence
and UI 16 was significant, only a small part of the UI 16
variance would appear to be accounted for by the
intelligence factor. Furthermore, the nonsignificant
intelligence-sick call r shows that virtually none of the
sick call variance is accounted for by the intelligence
factor as measured by the AQT. Thus, it appears that
intelligence is not related to frequency of sick calls in
the present sample.

Examination of Table 3 will show that 
several relationships which have been ob- 
tained in other studies were further supported 
by the present data. First, the sociometric 
indices seem to be decidedly lower for those who visit
sick bay frequently. Secondly, education and intelligence
seem to be unrelated to frequency of sick call visits in
this highly selected group. It was also demonstrated that
objectively measurable personality characteristics appear
to be significantly related to sick call frequency in the
present male population.
UI 16 had the highest correlation with the 
sick call criterion, its validity accounting for 
an estimated 32% of the variance. The nega- 
tive correlation between UI 16 and pilot 
proficiency ratings was also significant. Since 
the present personality tests do not infer 
personality traits from self-reported concern 
over health, it is suggested that the obtained 
correlations indicate a true relationship be- 
tween the personality traits measured and 
sick call frequency, rather than being arti- 
facts of the personality testing technique 
used. 
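
The 32% figure follows from squaring the reported correlation of .57. The sketch below shows that step, together with the usual t statistic for testing a correlation against zero with N = 55; the text does not say which significance test was used, so the t computation is an assumption added for illustration.

```python
import math


def variance_explained(r: float) -> float:
    """Proportion of variance shared by two variables correlated at r."""
    return r * r


def t_for_correlation(r: float, n: int) -> float:
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    if n < 3 or not -1 < r < 1:
        raise ValueError("need n >= 3 and -1 < r < 1")
    return r * math.sqrt((n - 2) / (1 - r * r))


r_ui16 = 0.57   # UI 16 with sick call frequency, as reported
n_ss = 55       # Ss with health records

shared = variance_explained(r_ui16)
t_value = t_for_correlation(r_ui16, n_ss)
print(f"r^2 = {shared:.4f}, t({n_ss - 2}) = {t_value:.2f}")
```

The squared correlation comes to .3249, which rounds to the estimated 32% quoted above.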

It was also found that those persons fre- 
quenting sick call were, on the average, rated 
lower sociometrically. It is impossible from the present
data to assess (a) whether those visiting sick bay often,
for whatever reason, are rated lower because of this; (b)
whether those misfitted to the group tend to frequent
sick bay; or (c) whether this relationship is the result
of some third factor such as UI 16. In any event, those
high on O-A
Battery Factors UI 16 and UI 22 and, to a 
lesser extent, those low on UI 26 tend to fre- 
quent sick bay and tend to be rated less 
proficient as pilots. 

Using the factor definitions presented by 
Cattell (1957b), the significant personality 
relationships suggest that those visiting sick 
bay more frequently are (from UI 16) char- 
acterized by fast, determined, effective action; 
are comparatively insuggestible; and tend to 
emphasize personal and esthetic, or cultured, 
values. They tend (from UI 22) to be high in 
cognitive fluency and may be characterized 
by much speed in simple mental processes 
and by bold and uncritical reactions. They ap- 
pear (from UI 26) to lack a desire for self- 
control and carefulness. 

From an examination of the test and factor 
correlates of sick call frequency, one might 
speculate that the individual frequenting sick 
bay is highly ego involved with intellectuality. 
He perhaps spends a great deal of time with 
self-relevant ideas and ascribes a high level of 
importance to his own worth, values, and 
health. His concern for following the rational,
objective, “intellectual” approach leads him to sick call
with symptoms which other, less self-involved persons may
tend to gloss over.


SUMMARY AND CONCLUSIONS 


A battery of objective personality tests, the
Objective-Analytic Personality Test Battery, was
administered to 81 Marine Corps officer helicopter
pilots, and the 12 resulting factor scores were
correlated with an index of frequency of sick call
visits. Sick call visits were also correlated with
sociometric rankings of pilot proficiency, officer-like
qualities, and social acceptability in their squadron.
Three of the 12 correlations between personality factors
and sick call frequency were significant at the .05 level
or better. All correlations between sociometric rankings
and frequency of dispensary visits were significant and
negative in direction.

The obtained relationships suggest the fol- 
lowing conclusions: 


1. Certain objectively measurable personality traits are
related to frequency of sick bay visits in the present
population. Significant positive correlations were
obtained between frequency of sick call visits and
personality Factors UI 16 and UI 22. A significant
negative correlation between UI 26 and sick calls was
also obtained. By using the factor interpretations
presented by Cattell (1957a, 1957b) it was suggested that
those frequenting sick bay would be characterized by
fast, determined, effective action and by a tendency to
emphasize personal and esthetic values. They are depicted
as high in cognitive fluency, displaying high speed in
mental processes, and they tend to be bold and uncritical
in their reactions, lacking a desire for self-control and
carefulness.

2. Sociometric ratings of pilot proficiency, officer-like
qualities, and social acceptability were significantly
related to frequency of sick call visits, with those
having the lower sociometric ratings being the most
frequent sick call visitors.

3. Educational level and intelligence were found to be
unrelated to frequency of sick call visits and to
sociometric ratings.


Since the present relationships were based on objective
test data rather than self-report data, it was
hypothesized that stable, primary personality and
temperamental traits may be the underlying factors
related to the tendency to seek frequent medical
consultation and to lower sociometric ratings. It is
doubtful that either low social acceptability or frequent
sick bay visits, for whatever reason, would affect scores
on O-A Battery tests, as might be the case with
questionnaires such as the MMPI.


DYNAMIC VISUAL ACUITY AS RELATED TO AGE,
SEX, AND STATIC ACUITY


ALBERT BURG 


Institute of Transportation and Traffic 


“Dynamic Visual Acuity,” or “DVA,” as it is conveniently
abbreviated, is the term used to designate the ability of
an observer to discriminate an object when there is
relative movement between the observer and the object.
The term was originated by Elek Ludvigh and James W.
Miller, and their pioneering study of the problem is
reported in a series of research articles (Ludvigh, 1953;
Ludvigh & Miller, 1953, 1954a, 1954b, 1955; Miller,
1956a, 1956b; Miller & Ludvigh, 1953, 1955, 1956).
Ludvigh and Miller found that the correlation between
dynamic acuity and static acuity (both tested
monocularly) was very low, and that training did not
alter this correlation in any way. Additional findings
were that (a) acuity for a moving target deteriorates
markedly and progressively as the angular velocity of the
target increases; (b) DVA performance can be improved
both through practice and through increased target
illumination; and (c) the above findings apply
substantially whether the plane of target movement is
horizontal or vertical, or whether the target is moving
with the subject stationary or vice versa.


When the present authors began their investigation of
DVA, it was with several purposes in mind. First, an
independent verification of Ludvigh and Miller’s results
appeared desirable, at least insofar as the relationship
between DVA and static acuity was concerned. In this
connection it was deemed important to establish normative
performance data for a more heterogeneous group of
subjects (Ludvigh & Miller used naval aviation cadets)
and also to alter their test procedure to permit somewhat
easier generalization to normally encountered visual
tasks.

A second aim was a study of the relationship, if any,
between DVA and other visual measures such as critical
flicker frequency (CFF) and lateral phoria (as stated in
terms of the ACA ratio), as well as nonvisual factors,
namely, age and sex. Also to be studied was performance
with the subject’s head fixed (as in Ludvigh & Miller’s
research), as opposed to that with the head permitted to
rotate freely.

The subject was then seated in the DVA apparatus and his
binocular static acuity again measured, this time using
the 15 projected targets. The targets were presented in
sequence, from largest to smallest, and a shutter to
[...] each target to approximately 1 second. For each
target size, the position of the checkerboard was
different (randomly so) from that in the Ortho-Rater
binocular test, and was different in each succeeding
test.


TABLE 1

PRODUCT-MOMENT CORRELATIONS BETWEEN (BINOCULAR) STATIC
AND DYNAMIC ACUITY

DVA Speed          Ortho-Rater Static     Screen Static
(degrees/second)   Acuity Correlation     Acuity Correlation

Free-Head
   20                   .3061                  .5238
   60                   .2798                  .6342
   90                   .2353                  .2674
  120                   .2107                  .4601
  150                   .1695                  .1406
  180                   .1762                  .4136

Fixed-Head
   60                   .2201                  .3509
   90                   .1411                  .1444
  120                   .4713                  .3202
  150                   .1290                  .0734

[The N columns of this table are not legible.]

The next step was to determine the subject’s DVA
threshold with the targets moving across the screen from
left to right. Each subject was tested (binocularly) at
four speeds, and a total of six angular velocities were
used during the course of the research (20, 60, 90, 120,
150, and 180 degrees/second). For each test speed the
subject was first shown a practice target to familiarize
him with the speed and path of the targets to follow. DVA
tests always were given in the order of increasing target
speed, and for each test, target size always progressed
from largest to smallest. The subject was required to
call out “top,” “bottom,” “left,” or “right” for each
target as it moved across the screen until the
combination of target size and velocity prevented his
being able to discriminate the position of the
checkerboard. Each target made only one sweep across the
screen, the projector carriage shifting automatically to
the next slide after the target left the screen. Scoring
followed the same criterion as for the Ortho-Rater, i.e.,
the subject’s score is the number (from 1 to 15) of the
last correctly discriminated target preceding two
consecutive misses. With the exception of the bite board,
the procedure for fixed-head subjects was the same as for
free-head subjects. In every acuity test, static or
dynamic, the subject was urged to guess at the answer
when he was not sure. This tends to overcome obvious
individual differences in preference for guessing.

FIG. 1. Dynamic visual acuity test apparatus, showing
subject’s head position fixed by means of a bite board.


TABLE 2

TEST-RETEST RELIABILITY COEFFICIENTS

Test                          Coefficient

CFF                              .8079
ACA Ratio                        .8217
Ortho-Rater Static Acuity        .7359
Screen Static Acuity             .5290

[Rows for DVA Free-Head and DVA Fixed-Head (by speed in
degrees/second) follow in the original; their values are
not legible.]

Subjects were randomly assigned the head-fixed or 
head-free condition, and of those subjects retested, 
some were randomly chosen to be tested under the 
alternate head condition, while the others were re- 
tested under the same condition. 
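
The scoring criterion described above can be sketched in a few lines of Python. This is only an illustration: the function name `acuity_score` and the list-of-booleans input are assumptions introduced here, as is the convention that a subject who discriminates no target at all receives a score of 0.

```python
def acuity_score(responses):
    """Number of the last correctly discriminated target that precedes
    the first run of two consecutive misses.

    `responses` is ordered from the largest target (number 1) to the
    smallest (up to number 15); True means the checkerboard position
    was called correctly. The 0 returned when nothing is discriminated
    is an assumption, not something the paper specifies.
    """
    score = 0
    consecutive_misses = 0
    for number, correct in enumerate(responses, start=1):
        if correct:
            score = number
            consecutive_misses = 0
        else:
            consecutive_misses += 1
            if consecutive_misses == 2:
                break  # two misses in a row end the test
    return score


# Hypothetical response record: a single miss at target 10 does not end
# the test, but the misses at targets 12 and 13 do.
example = [True] * 9 + [False, True, False, False, True]
print(acuity_score(example))
```

Under this reading, an isolated miss followed by a correct call does not terminate the series, which is why urging subjects to guess matters for the score.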


Subjects 

The research has been in progress for the past 3
years. To date, a total of 236 subjects have been
tested (110 males and 126 females), and 96 of these
subjects were tested at least twice. The age range
was from 16 to 67, with the great majority (79%)
falling in the 16-25 year age group. Since the
authors’ research interests lie primarily in the areas of
transportation and driver characteristics, the subjects
were chosen so as to permit generalization to a
population of drivers. All subjects were drivers, and were
required to have a (monocular) static acuity of at
least 20/40 corrected (as is the case in the state of
California). Static acuity ranged from 20/40 to
20/13. The subjects were a volunteer group of
students and employees at the University of California,
Los Angeles, and due to normal attrition in both
groups, only 96 were readily available for retesting.
The time between test and retest ranged from 2 days
to 18 months, with a mean of about 8 months.


RESULTS 


1. As reported in a recent publication 
(Burg & Hulbert, 1959), no correlation was 
found between CFF and ACA ratio, or be- 
tween either CFF or ACA ratio and either 
static or dynamic acuity. 

2. Low but significant correlations were 
found between DVA and Ortho-Rater static 
acuity, these correlations decreasing with in- 
creasing target velocity and being generally 
lower and less consistent in trend for fixed-
head DVA than for free-head DVA. The same 
general pattern is evident in the correlations 
between DVA and static acuity as measured 
on the screen, except that these tend to be 
higher. Table 1 presents the specific values. 

3. Test-retest reliability was determined
for all tests used and the results are presented
in Table 2. It will be noted that with the
exception of those for the fixed-head DVA
tests, the coefficients are all statistically
significant and respectably high.

TABLE 4
VISUAL ACUITY EQUIVALENTS OF
ORTHO-RATER SCORES

Ortho-Rater Score    Visual Angle          Snellen
(monocular)          (minutes of arc)      Rating
 1                                         20/200
 2                                         20/100
 3                                         20/67
 4                                         20/50
 5                                         20/40
 6                                         20/33
 7                                         20/29
 8                                         20/25
 9                                         20/22
10                                         20/20
11                                         20/18
12                                         20/17
13                                         20/15
14                                         20/14
15                                         20/13

4. Table 3 presents a summary of the mean 
static and free-head DVA scores categorized 
by age and sex. There is a consistently better 
performance for the male subjects for each 
test, when age groups are lumped. While some 
of these differences are not statistically sig- 
nificant, those for the screen static acuity and 
three of the six DVA speeds are significant at 
the .002 level or better (t test). To permit
clearer interpretation, Table 4 shows the vis- 
ual angles and Snellen ratings corresponding 
to the Ortho-Rater scores used in Table 3. 

Due to the small number of subjects in the 
higher age brackets, no generalizations can be 
made about differential DVA or static acuity 
as a function of age. 
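
The relation between a Snellen rating and the visual angle it implies can be sketched as follows. The sketch relies on the conventional definition that 20/20 corresponds to detail subtending 1 minute of arc; the function name is hypothetical, the pairing of scores 1 to 15 with the Snellen ratings in listed order is an assumption, and the computed angles may differ in rounding from the values printed in Table 4.

```python
def snellen_to_visual_angle(denominator, numerator=20):
    """Minimum angle of resolution, in minutes of arc, for a Snellen
    rating numerator/denominator (20/20 is taken as 1 minute of arc)."""
    return denominator / numerator


# Snellen denominators in the order listed in Table 4; pairing them
# with Ortho-Rater scores 1..15 is an assumption of this sketch.
SNELLEN_DENOMINATORS = [200, 100, 67, 50, 40, 33, 29, 25,
                        22, 20, 18, 17, 15, 14, 13]

for score, denom in enumerate(SNELLEN_DENOMINATORS, start=1):
    angle = snellen_to_visual_angle(denom)
    print(f"score {score:2d}  20/{denom:<3d}  {angle:5.2f} min of arc")
```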


DISCUSSION 


In general, the results of this research sup- 
port the findings of Ludvigh and Miller. Both 
studies reveal a progressive decrease in acuity 
for a moving target as target velocity in- 
creases. The fact that the present study found 
low but significant correlations between static 
acuity and DVA, while Ludvigh and Miller 
did not, is undoubtedly due to the high degree 
of similarity between our static and dynamic
targets, as well as the more heterogeneous 
subject population we used. Ludvigh and 
Miller used Snellen ratings for static acuity 
and a series of Landolt rings for their (mon- 
ocular) DVA test, and their subjects were 
young healthy males with 20/20 vision or 
better (uncorrected). Thus, the restricted 
range of static acuity represented in their test 
population would quite naturally tend to 
reduce the correlation between static and 
dynamic acuity. While the true magnitude of 
the correlation between the two is still uncer- 
tain, the inevitability of their interrelation is 
not. Quite obviously, a person who is blind, 
or nearly so, will always do poorly in both 
static and dynamic acuity tests. 

Although considerably more representative 
of the general population in this respect, the 
present group of subjects also is limited in its 
range of static acuity (from 20/13 to 20/40 
corrected). While this again serves to under- 
emphasize the true correlation between static 
and dynamic acuity, these results are of con- 
siderable value in generalizing to practical 
situations. For example, it is the authors’ be- 
lief that dynamic visual acuity may be at 
least as important as static acuity in perform- 
ing the task of driving an automobile, and the 
study reported here represents the initial 
phases of a research project to investigate 
this hypothesis. Consequently, subjects were 
chosen (drivers, with static acuity of 20/40 
or better in one eye) so as to maximize their 
usefulness in the continuing research pro- 
gram, in which it is intended to use as many 
of the same subjects as are available. This 
would not be possible if a random sample of 
the general population had been used. 

The results clearly indicate that factors 
other than static acuity play an important 
part in determining an individual's ability to 
discriminate a moving object. Even when cor- 
rected for attenuation the highest correlation 
between Ortho-Rater static acuity and DVA 
is only .394. In addition, the importance of 
static acuity as a determiner of DVA becomes 
progressively less as target velocity increases. 
There is much room for speculation as to the 
precise nature of the nonacuity factors, but 
they reflect the efficiency of integration of the 
entire oculomotor system, as well as nonvisual 
factors such as attention, differential practice
effect, and experimental error. Research cur- 
rently underway on eye movement analysis 
and visual pursuit efficiency should shed con- 
siderable light on this problem area. 
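
The correction for attenuation mentioned above divides an observed correlation by the square root of the product of the two tests' reliabilities. The sketch below shows the arithmetic only; the function name is hypothetical and the numerical inputs are round illustrative placeholders, not coefficients taken from Tables 1 or 2.

```python
import math


def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Spearman's correction: the correlation expected between two
    measures if both were perfectly reliable."""
    if not (0 < r_xx <= 1 and 0 < r_yy <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(r_xx * r_yy)


# Illustrative placeholders (assumed values, not the study's data).
observed_r = 0.30          # observed static-dynamic correlation
static_reliability = 0.75  # test-retest reliability of the static test
dva_reliability = 0.80     # test-retest reliability of the DVA test
print(round(correct_for_attenuation(observed_r, static_reliability,
                                    dva_reliability), 3))
```

Because both reliabilities are at most 1, the corrected value can never be smaller than the observed one, which is why a low corrected coefficient is a strong sign that other factors are at work.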

Static acuity measured on the screen correlated
more highly with DVA than did
Ortho-Rater static acuity. This would be expected
as the testing situations were more
similar in that the screen static test and DVA
tests employ the same target slides and also
duration of exposure is limited in both instances.
Fixed-head DVA correlated with
static acuity to a lesser extent than did
free-head DVA. This is not surprising since the
limits of angular rotation of the eye alone
permit foveal vision over an arc of perhaps
90°-100°, while the combination of head and
eye movements permits the subject to fixate
the target over the full 180° sweep. Thus,
fixing the head reduces the time during which
the subject can view the target clearly.

With regard to test reliability it should be
explained that during the course of the study,
no attempt was made to equate all test-retest
conditions for
each subject. For example,
factors such as time of day (and corresponding
fatigue and/or eyestrain), level of motivation,
interval between test and retest, and
person conducting the experiment were not
equated. Three different individuals took
turns serving as experimenter and although
all were thoroughly trained, there may have
been sufficient inter- and intra-experimenter
variability to have adversely affected the
reliability results. It is reasonable to expect
even higher reliability coefficients if care is
taken to prevent these possible sources of
variation. Also, it appears likely that testing
a large number of subjects under fixed-head
conditions would serve to reveal the same
consistency of results that prevailed for the
free-head condition.

As for age and sex differences in performance,
no age related differences appeared, and
generally higher scores resulted for males
than for females. Neither ex post facto
“logic” nor previous research offers any
clear-cut explanation for the latter result. It is
conceivable, for example, that as a consequence
of the experimental situation, the
general level of motivation was lower for the
female subjects. However, there is no proof
that this was the case, and this is a problem
which merits further investigation. Also,
additional performance data from subjects in
the higher age brackets are necessary before
any conclusions can be drawn concerning
possible differential age effects in DVA.
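
The argument in the Discussion that fixing the head shortens the time available for clear viewing can be made concrete with simple arithmetic: viewing time equals the usable arc divided by the angular velocity. The sketch below is only illustrative. It assumes a constant target velocity, takes the lower end of the 90°-100° estimate for eye rotation alone, and treats the full sweep as 180°, which is an assumption about the apparatus; the names are hypothetical.

```python
def viewing_time(arc_degrees, velocity_deg_per_sec):
    """Seconds a target moving at constant angular velocity spends
    inside an arc of the given size."""
    if velocity_deg_per_sec <= 0:
        raise ValueError("velocity must be positive")
    return arc_degrees / velocity_deg_per_sec


EYE_ONLY_ARC = 90.0  # assumed lower bound for eye rotation alone
FULL_SWEEP = 180.0   # assumed sweep with head and eyes combined

for velocity in (20, 60, 90, 120, 150, 180):
    fixed = viewing_time(EYE_ONLY_ARC, velocity)
    free = viewing_time(FULL_SWEEP, velocity)
    print(f"{velocity:3d} deg/s: fixed-head {fixed:.2f} s, "
          f"free-head {free:.2f} s")
```

On these assumptions the fixed-head subject has about half the viewing time at every speed, and at the fastest speed that time falls to roughly half a second.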


SUMMARY 


The results of this research clearly indicate 
that a person’s ability to discriminate a 
moving target cannot be predicted adequately 
from his static acuity, and that the adequacy 
of this prediction decreases as the speed of 
the moving target increases. 

The exact nature of those factors other
than static acuity that influence dynamic
acuity is not yet known, but it is probable
that they involve the efficiency of the entire
oculomotor system.

No relationship was found between dy- 
namic visual acuity (DVA) and either 
critical flicker frequency or lateral phoria 
(ACA ratio). Also, the small number of 
subjects in the higher age brackets makes 
impossible a generalization as to the effects 
of age on DVA performance. Finally, the 
results suggest a consistent and significant 
difference in performance between male and 
female subjects, the latter performing less 
adequately. 

It is suggested that testing of a large 
number of additional subjects of both sexes 
and of all ages will serve to correct the several 
inconsistencies appearing in these results, but 
it is not expected that the basic conclusions 
will be significantly altered. Once having 
established DVA as a relatively independent, 
reliable measure of visual ability, the next 
step becomes the study of the relationship 
between DVA and performance in a variety 
of tasks where discrimination of moving 
objects plays a key role, such as in driving, 
flying, ball playing, and the like. Studies are 
currently underway toward this end. 
WELSH’S INTERNALIZATION RATIO
AS A BEHAVIORAL INDEX


BERNARD J. FINE


Quartermaster Research and Engineering Center, Natick, Massachusetts


Welsh (1956) has presented an Internalization
Ratio (IR) which attempts to express
in simple quantitative terms some of the
interrelationships between certain of the MMPI
scales concerned with mood and behavior.

The IR is derived by relating the “mood”
scales to the “behavior” scales of the MMPI
as described by the formula

$$\mathrm{IR} = \frac{\mathrm{Hs} + \mathrm{D} + \mathrm{Pt}}{\mathrm{Hy} + \mathrm{Pd} + \mathrm{Ma}}$$

where Hs, D, Pt, Hy, Pd, and Ma represent
the hypochondriasis, depression, psychasthenia,
hysteria, psychopathic deviate, and
hypomania scales in that order. The IR is
defined so that it yields a theoretical value
of 1.00 in “normal” cases. Individuals who
tend to “internalize” their problems, who
experience somatic symptoms and subjective
feelings of stress, would be expected to obtain
IR scores above 1.00. Those who “externalize”
their conflicts would be expected to score
below 1.00.

It would be expected that most individuals
who get into trouble or give trouble to others
either verbally by griping, cursing, and the
like, or physically by fighting, engaging in
criminal activities and so forth, would be
extreme “externalizers.” This assumption is
substantiated by data presented by Welsh
(1956), derived from various studies, which
indicates that whereas groups of college
students range from .90 to .96 in mean IR
scores, groups of delinquents and prison
inmates range from .84 to .89.

In certain experimental situations, it is
desirable to eliminate the so-called “bad
actors” from among the test subjects. For
reasons of economy and administration of the
experiment it would be helpful if the individuals
could somehow be identified beforehand.
The research reported here investigates
the effectiveness of the IR in distinguishing
between “satisfactory” and “unsatisfactory”
test subjects.


STUDY I
Method


Subjects. The subjects were 34 enlisted men who
had been part of a test subject pool at a military
installation at some time during the period extending
from early 1957 to early 1959 but who were no
longer in the pool. The 34 men constituted the total
population of men whose service as test subjects
had been terminated during the aforementioned
period. Of the 34 subjects, 6 were discharged for
medical reasons and are not considered here. The
remaining 28 subjects constituted the population for
this study.

Procedure. During the course of their terms as test
subjects, all subjects had completed the MMPI as
part of a research program devoted to determining
relationships between personality variables and
individual responses to environmental stress.

From independent records kept at the installation,
it was determined that 13 of the 28 subjects were
honorably discharged from the Army while still
serving in the capacity of test subjects, having fully
completed their tours of duty. These men were
performing satisfactorily as test subjects right up until
their time of discharge. They will be referred to as
the “satisfactory” group. The remaining 15 subjects
had been dismissed as members of the test subject
pool and reassigned to other duties because of
unsatisfactory conduct on or off duty while assigned to
the pool. Unsatisfactory conduct is defined as fist
fights, auto theft, excess profanity on duty, drunkenness,
refusal to perform duties required of them on
test, and other behavior generally intolerable of test
subjects in the test subject pool. This group will be
referred to as the “unsatisfactory” group.

It was predicted that the mean IR for the
satisfactory group would be significantly higher than the
mean IR for the unsatisfactory group.


Results 


IRs were calculated for the 28 subjects. 
The mean IR for the satisfactory group was 
.91; the mean IR for the unsatisfactory group 
was .82. The difference between the two 
means is statistically significant (t = 2.44,
p < .025, 1-tailed test), indicating verifica- 
tion of the prediction. 
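
The ratio whose group means are compared here can be sketched as follows. The function name and the sample scale scores are hypothetical, and the use of standard (T) scores as inputs is an assumption; the values below are not data from the study.

```python
def internalization_ratio(hs, d, pt, hy, pd, ma):
    """Welsh's IR: sum of the "mood" scales (Hs, D, Pt) divided by the
    sum of the "behavior" scales (Hy, Pd, Ma)."""
    behavior = hy + pd + ma
    if behavior == 0:
        raise ValueError("behavior scales sum to zero")
    return (hs + d + pt) / behavior


# Hypothetical scale scores for one subject (assumed, for illustration).
ir = internalization_ratio(hs=52, d=55, pt=50, hy=58, pd=64, ma=62)
tendency = "internalizer" if ir > 1.00 else "externalizer"
print(round(ir, 2), tendency)
```

Because the index is a ratio, equal scores on all six scales give exactly 1.00, and a subject can fall well below 1.00 without any single scale being extreme.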

1 The decision to dismiss subjects as members of
the test subject pool was made by persons unaware
of the performance of the subjects on any
psychological test.

TABLE 1
CLASSIFICATION OF SUBJECTS IN THE SATISFACTORY AND
UNSATISFACTORY GROUPS ACCORDING TO WHETHER
OR NOT THEIR IR SCORES FELL ABOVE .87

                              IR above .87    IR at or below .87
Test subject pool
  Satisfactory subjects
  Unsatisfactory subjects
Field exercise group
  Satisfactory subjects
  Unsatisfactory subjects


Inspection of the IR scores indicated that
the IR would discriminate most effectively
between the satisfactory and unsatisfactory
subjects if .87 was established as a “cut off”
point. Table 1 shows the classification of
subjects in the satisfactory and unsatisfactory
groups according to whether or not their
scores fell above .87.

Since the IR was adopted as part of the 
test battery for the selection of test subjects 
for the test subject pool shortly after this 
study was completed, it has not been possible 
to validate the .87 cut off point on this 
population. However, a second study, de- 
scribed below, provided an opportunity for 
further investigation on a similar population. 


STUDY II
Method


Subjects. The subjects were 15 enlisted men and
two officers who participated in a 6-week field
exercise in a subarctic region during the summer of 195!

Procedure. Prior to leaving for the field exercise,
all of the subjects completed the MMPI. The MMPI
data were not used in the selection of the men for
the field exercise. IRs were calculated for all of the
subjects.

Upon completion of the field exercise, five civilians
who had accompanied the subjects on the trip, acting
as supervisors and observers, were asked to rank the
17 subjects in the order in which they would prefer
to have them back if they had to go through the
same field study once again. The judges had had
ample opportunity to observe the subjects both on
and off duty during the field exercise.

The subjects who received the nine highest average
rankings, based on the preferences of the five judges,
were termed the “satisfactory group,” and the
subjects who received the eight lowest average rankings
were termed the “unsatisfactory group.”

It was predicted that the satisfactory group would
have a significantly higher mean IR than the
unsatisfactory group and that significantly more
subjects having IRs above .87 would fall into the
satisfactory category than would subjects having IRs at
or below .87. (Conversely, it was expected that more
subjects having IRs of .87 or below would fall into
the unsatisfactory category than would subjects
having IRs above .87.)



Results 


In order to determine the consistency be- 
tween the judges in their rankings of the 
subjects, Kendall’s coefficient of concordance 
W was computed (Siegel, 1956, pp. 229-238).
The resultant W of .706 was transformed into
chi square (Siegel, 1956, p. 236) yielding a
χ² of 56.48 (16 df) which is statistically
significant (p < .01) indicating a high degree of
consistency between the judges in ranking the 
subjects. 
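
The transformation of W into chi square reported above follows the standard approximation χ² = k(N − 1)W with N − 1 degrees of freedom, where k is the number of judges and N the number of subjects ranked. The sketch below reproduces that arithmetic and also shows how W itself is obtained from a table of untied ranks; the function names are hypothetical, and no ranking data from the study are used.

```python
def kendalls_w(rankings):
    """Kendall's coefficient of concordance for k judges who each rank
    the same n items without ties.

    `rankings` is a list of k lists; each inner list holds the ranks
    1..n assigned by one judge to the n items.
    """
    k = len(rankings)
    n = len(rankings[0])
    rank_sums = [sum(judge[i] for judge in rankings) for i in range(n)]
    mean_sum = k * (n + 1) / 2
    s = sum((total - mean_sum) ** 2 for total in rank_sums)
    return 12 * s / (k ** 2 * (n ** 3 - n))


def w_to_chi_square(w, k, n):
    """Chi-square approximation for W and its degrees of freedom."""
    return k * (n - 1) * w, n - 1


# Five judges, 17 subjects, W = .706 as reported in the text.
chi_square, df = w_to_chi_square(0.706, k=5, n=17)
print(round(chi_square, 2), df)  # 56.48 16
```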

The mean IR score for the satisfactory 
group was .98 and for the unsatisfactory 
group, .85. This difference is statistically 
significant (t = 2.05, p = .03, 1-tailed).

Table 1 shows the distribution of subjects 
with respect to satisfactory performance and 
to the .87 cut off point. The difference shown 
in Table 1 between the satisfactory category 
and the unsatisfactory category is statisti- 
cally significant at the .05 level using Fisher’s 
exact probability test (Siegel, 1956, pp. 96-104).
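
Fisher’s exact probability test for a 2 × 2 classification can be sketched as below. The function name is hypothetical, and the cell counts in the example are invented purely for illustration (they merely respect the group sizes of 9 and 8); they are not the counts of Table 1.

```python
from math import comb


def fisher_exact_one_tailed(a, b, c, d):
    """One-tailed probability of a table at least as extreme as the one
    observed (top-left cell a or larger), with all margins fixed.

    Layout:               IR above cut    IR at or below cut
        satisfactory           a                  b
        unsatisfactory         c                  d
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denominator = comb(total, col1)
    upper = min(row1, col1)
    numerator = sum(comb(row1, x) * comb(row2, col1 - x)
                    for x in range(a, upper + 1))
    return numerator / denominator


# Invented counts for 9 satisfactory and 8 unsatisfactory subjects.
p = fisher_exact_one_tailed(7, 2, 2, 6)
print(round(p, 3))
```

The test sums hypergeometric probabilities rather than relying on a chi-square approximation, which suits a sample as small as 17.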


Discussion 


In both studies, the prediction that the 
satisfactory group would have a significantly 
higher mean IR than the unsatisfactory group 
was substantiated. In addition, the prediction 
that the .87 cut off point, derived from the 
data of the first study, would discriminate 
significantly between satisfactory and un- 
satisfactory subjects in the second study was 
verified. 

In general, it appears that the IR has merit 
as a behavioral index although more gen- 
eralizable data from a larger population is 
certainly necessary and desirable. 

The question may still be raised, however,
as to the advantage of using the IR over any
of the separate scales that compose it since
some or all of them may discriminate as well.
Certainly, if one or more scales perform the
same function individually as they do in
combination (IR), then parsimonious procedure
would demand using the former. Accordingly,
the mean standard scores on each
of the six scales used in deriving the IR
were computed for the “satisfactory” and
“unsatisfactory” groups in the first study.
Only the Ma scale significantly discriminated
between the two groups. Classification of
“satisfactory” and “unsatisfactory” subjects
according to their Ma scores using two cut off
points yielding maximum discriminability
resulted in the failure of the Ma scale to
discriminate significantly between subjects on
the satisfactory-unsatisfactory dimension.
Examination of the data indicated that the
obtained significant difference between the
two group Ma means was due to two or three
extreme scores in one of the groups.

An identical analysis of the mean scale
scores for the satisfactory and unsatisfactory
groups in Study II yielded essentially the
same results. Here, the D and Pt scales
significantly discriminated between the groups
but neither scale approached significance in
discriminating between individuals. Again the
significance between the means was apparently
due to a few extreme scores.

The IR shows consistency from Study I
to Study II both in discriminating between
groups and individuals. The individual scale
scores are inconsistent in discriminating
between groups and fail to discriminate between
individuals. Clearly, then, the combination of
the six scales into the IR yields something
which the six scales taken individually do
not.² Emphasis on the relationship between
the scales rather than on the absolute values
of individual scale scores appears to be
warranted in the case of Welsh’s Internalization
Ratio.


Summary 


Two studies are reported. In the first, 13 
men were classified as “satisfactory” and 15 
men as “unsatisfactory” test subjects by 
independent criteria. Welsh’s MMPI-derived 
Internalization Ratio (IR) significantly
discriminated between the two groups. In 
addition, an IR of .87 was found to discrimi- 
nate maximally between individuals. 

The second study, using 17 men engaged in 
a field exercise, further validated the IR as 
an index of group desirability. Furthermore, 
the .87 cut off point again significantly
discriminated between satisfactory and
unsatisfactory individuals.

2 It has been pointed out and should be noted here
that a combination of scale scores (such as in a
discriminant function analysis) possibly could
perform in this situation despite the lack of consistent
findings using the scales individually. However,
unless such a procedure resulted in significantly
greater predictability over that obtained here using
the IR, the latter would appear to be the more
convenient method.


SOME ASPECTS OF ATTEMPTED, SUCCESSFUL,
AND EFFECTIVE LEADERSHIP¹


The origins of leadership were examined in
detail in Parts III, IV, and VI of Leadership, 
Psychology and Organizational Behavior 
(Bass, 1960). A variety of theorems can be 
derived from an examination of this discus- 
sion concerning the development and change 
of individuals in their attempts to lead, their 
success in influencing others, and the effec- 
tiveness of their leadership. 

The present paper describes tests of the 
empirical adequacy of three theorems: 


1. Successful leadership is related more to 
ability in effective compared to ineffective 
groups. 

2. Successful leadership is related more to 
esteem in effective compared to ineffective 
groups. 

3. Discrepancies between esteem and self- 
esteem are manifested in unsuccessful leader- 
ship. 


The logic of each theorem will be discussed 
in more detail as we consider the specific 
analysis developed to verify or refute the em- 
pirical adequacy of each theorem. 


SUBJECTS AND METHOD


Subjects of these experiments were 255 ROTC
cadets meeting in 51 problem-solving groups of five
men each.

Before each of 10 brief discussions, each member
ranked a list of five words according to the
estimated familiarity (X) of the words. Following
a discussion to reach a rank order acceptable to
the group, members again privately reranked the
five words (Y). The rho correlations between
members’ initial and final rankings, the group decisions
following discussion (G), and the correct rankings
(R) based on mass survey data provided measures
of leadership and effectiveness. These correct rankings
were announced after each discussion. Attempted
leadership, regarded as time talked in these
discussions, and total group participation were measured
by clocks activated by voice.²


1 This paper was prepared for presentation at the
sixty-eighth Annual Convention of the American
Psychological Association, September 1960, Chicago,
Illinois. Much of the data analysis was completed by
Austin Flint, now with the Metropolitan Life Insurance
Company.
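
The rho-based measures used throughout can be sketched as follows. The sketch computes Spearman’s rho for untied rankings and, from it, relative success as a leader as the paper’s footnote defines it: the average rho between one member’s initial ranking and the others’ final rankings, minus the average rho between his final ranking and the others’ initial rankings. All function names and all rankings below are hypothetical illustrations, not data from the study.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two untied rankings of n items."""
    n = len(rank_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


def relative_success(initial, final):
    """Relative success as a leader for each member of one discussion.

    initial[i] and final[i] are member i's rankings before (X) and
    after (Y) discussion.
    """
    m = len(initial)
    scores = []
    for i in range(m):
        others = [j for j in range(m) if j != i]
        influence = sum(spearman_rho(initial[i], final[j])
                        for j in others) / len(others)
        yielding = sum(spearman_rho(final[i], initial[j])
                       for j in others) / len(others)
        scores.append(influence - yielding)
    return scores


# Hypothetical rankings of five words by five members.
X = [[1, 2, 3, 4, 5], [2, 1, 3, 5, 4], [5, 4, 3, 2, 1],
     [1, 3, 2, 4, 5], [3, 1, 2, 5, 4]]
Y = [[1, 2, 3, 4, 5], [1, 2, 3, 5, 4], [2, 1, 3, 4, 5],
     [1, 2, 3, 4, 5], [1, 2, 4, 3, 5]]
R = [1, 2, 4, 3, 5]  # hypothetical criterion ranking

A = relative_success(X, Y)
print([round(a, 2) for a in A])
print(abs(sum(A)) < 1e-9)  # scores within a discussion sum to zero
print(spearman_rho(X[0], R))  # initial accuracy (RX) of the first member
```

The zero sum holds by construction, since every pairwise rho enters once with a plus sign and once with a minus sign; this is why relative success, unlike total success, cannot be inflated by how effective the discussion was.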


Successful Leadership is Related More to 
Ability in Effective Compared to Ineffective 
Groups 


Available data made it possible to test the hy- 
pothesis that proficiency is associated more strongly 
with successful leadership in effective rather than in- 
effective groups. This hypothesis was suggested by 
reasoning that if the more able members of a group 
succeed as leaders rather than the less proficient, 
their leadership will be more effective and conse- 
quently the group will be more effective. By 
successful leadership we mean efforts to change the 
behavior of other members which result in such 
change. By effective leadership we mean changes 
wrought in the other members which are rewarding 
to them (or nonpunishing). Effective groups are 
rewarding groups. 

The argument proceeds as follows—first, we eluci- 
date elsewhere (Bass, 1960) the relation between 
successful leadership, effective leadership, and group 
effectiveness which in summary is: “Successful 
leadership must occur usually in order for groups to 
become more effective. Such leadership can be
described as effective leadership” (p. 133).

Next, we relate ability and effective leadership 
summarizing the agreement by stating: “Who will 
be effective, if he is successful (as a leader) ?” 
(p. 139), “He who has the ability to solve the 
group’s problems” (p. 140). 

A further derivation leads to the proposition that 
a group will be more effective when its ablest mem- 
bers are given high status so that they are more 
likely to succeed as leaders: “the higher the congru- 
ence within a group between status and ability, the 
more effective the group is likely to be” (p. 337).

These propositions fit with our corollary argument 
here that if we observe that the ablest members 
were indeed the successful leaders of a group, the 
group would be more effective than if less proficient 
members were most influential. 

The 51 groups of ROTC cadets, each evaluating
10 problems, provided data concerning 510
discussions. These 510 discussions were divided into the
255 most publicly effective discussions and the 255
least publicly effective.³ Then, the 1250 measures of
relative success as a leader (A)⁴ drawn from the 255
effective discussions among five members each were
correlated with their scores for initial accuracy
(RX).⁵ Public and private success as a leader were
not used because total public or private successful
leadership was found in another analysis to be higher
in effective group discussions and lower in ineffective
group discussions (Bass et al., 1957). On the other
hand, by its construction, total relative success was
always zero for any single discussion, per se.

For the 1250 measures of ability and leadership
drawn from the 255 effective discussions, a highly
significant product-moment correlation of .45 was
found between any member’s initial accuracy (RX)
and his subsequent relative success as a leader (A).
When the same product-moment correlation between
initial accuracy and relative successful leadership
was calculated for the 1250 paired values of ability
and leadership drawn from the less effective
discussions, a coefficient of only .07 was obtained between
initial accuracy and relative success as a leader.
(This difference between .45 and .07, each based on
1250 cases, was significant at the 1% level according
to a t test of the significance of the difference
between z coefficients although with this large df it is
difficult not to obtain statistical significance.
Presumably because of the large df's, the reliability of
the obtained coefficients of .45 and .07 is high enough
to be looked at as a stable description of the effective
and ineffective discussions and the question of
statistical significance is not too appropriate.)

The analysis was repeated for private
effectiveness.⁶ A product-moment correlation of .48 was
obtained between initial accuracy and relative
successful leadership for the 1250 scores drawn from
the 255 discussions more effective privately while a
significantly lower corresponding correlation of .36
was obtained for the 1250 scores drawn from the
255 less effective discussions.

The data strongly support the argument that
proficiency of members is associated with their success
as leaders more in effective in contrast to ineffective
groups, particularly publicly effective in comparison
to publicly ineffective ones. It would seem that
Plato’s proposal to have philosopher-kings has merit
if the “philosophers” are proficient in solving the
kingdom’s problems.


2 For details concerning these measurements, see
Bass, Flint, and Pryer (1957) and Bass, Pryer, Gaier,
and Flint (1958).

3 Public effectiveness was the extent the group
decision following discussion was more accurate than the
average initial ranking by the members of the
familiarity of five words. Public effectiveness of a discussion
is equal to $\rho_{GR} - \rho_{RX}$.

4 $A_i = \rho_{X_i Y_j} - \rho_{X_j Y_i}$. Relative success as a leader
is equal to the average rho correlation between a first
member’s initial opinions ($X_i$) and the other members’
final opinions ($Y_j$) minus the average rho correlation
between the first member’s final opinions ($Y_i$) and
everyone else’s initial rankings ($X_j$). It can be shown
that $\sum A = 0$.

5 For convenience the symbols RX refer to the rho
correlation between a rank order like X, the initial
rankings of a subject before discussion, and R, the
criterion rank order. RY is the symbol of $\rho_{RY}$.

6 Private effectiveness = $\rho_{RY} - \rho_{RX}$, the difference
between the average accuracy of members after a
discussion and that before the discussion.

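
The comparison of the coefficients for effective and ineffective discussions rests on Fisher’s r-to-z transformation. The sketch below shows the usual large-sample form of that test; the function name is hypothetical, and treating the 1250 measures in each set as independent observations is an assumption that the nesting of members within discussions does not strictly justify.

```python
import math


def compare_correlations(r1, n1, r2, n2):
    """Standard normal deviate for the difference between two
    correlations from independent samples, via Fisher's z."""
    z1, z2 = math.atanh(r1), math.atanh(r2)  # r-to-z transformation
    standard_error = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (z1 - z2) / standard_error


# Coefficients and case counts as reported in the text.
z_public = compare_correlations(0.45, 1250, 0.07, 1250)
z_private = compare_correlations(0.48, 1250, 0.36, 1250)
print(round(z_public, 2), round(z_private, 2))
```

Both deviates exceed the 2.58 needed for the 1% level, the public comparison by a very wide margin, which agrees with the remark that significance is hard to avoid with this many cases.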


Successful Leadership Related to Greater Es- 
teem in Effective Compared to Ineffective 
Groups 


By esteem, we mean the judged value of a person to his group regardless of his position in the group (Bass, 1960, p. 277). A member’s esteem usually is positively associated with his ability (pp. 284-285). As just shown, a member's ability correlates more with successful leadership in effective groups than in ineffective groups. It follows that esteem also is likely to associate more with success as a leader in effective groups than in ineffective groups. In other words, the successful-effective leader is expected to be more esteemed than the successful-ineffective leader. The coach who guides his football team to victory is the hero; the losing coach is hanged in effigy.

The 51 ROTC cadet groups were divided into the 25 highest in public effectiveness on all 10 discussions combined and the 25 lowest on the average in all 10 discussions by each group. This pooling was necessary since esteem was based on opinions at the end of the last discussion only. One group, middlemost in effectiveness, was discarded. The correlation between esteem⁷ and relative success as a leader (A) was .42 for the 125 members of the 25 effective groups while it was only .22 for the 125 members of the 25 ineffective groups. For the 250 cases, the difference between .42 and .22 was significant at the 5% level of confidence. The experimental results again fit nicely with the logical expectation that successful-effective leaders are more esteemed than successful-ineffective ones.
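The comparison of the two esteem-success correlations (.42 in the effective groups, and the .22 reported in the Summary for the ineffective groups) can be illustrated with a Fisher r-to-z test. The text does not say which test was used, so this is only a sketch under stated assumptions: the 125 members in each set are treated as independent observations, and the function name `fisher_z_difference` is hypothetical.

```python
import math


def fisher_z_difference(r1, n1, r2, n2):
    """Compare two independent correlations with Fisher's r-to-z transform.

    Returns (z, one_tailed_p, two_tailed_p) for the null hypothesis that
    both samples come from populations with the same correlation.
    Assumption: observations within each sample are independent.
    """
    z1 = math.atanh(r1)  # Fisher transform of the first correlation
    z2 = math.atanh(r2)  # Fisher transform of the second correlation
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # standard error of z1 - z2
    z = (z1 - z2) / se
    one_tailed = 0.5 * math.erfc(z / math.sqrt(2.0))    # P(Z > z)
    two_tailed = math.erfc(abs(z) / math.sqrt(2.0))     # P(|Z| > |z|)
    return z, one_tailed, two_tailed


# Values taken from the text: r = .42 and r = .22, 125 members in each set.
z, p_one, p_two = fisher_z_difference(0.42, 125, 0.22, 125)
print(f"z = {z:.2f}, one-tailed p = {p_one:.3f}, two-tailed p = {p_two:.3f}")
```

Under these assumptions the difference reaches the 5% level on a one-tailed reading but not on a two-tailed one. Because members are nested within groups, the independence assumption is generous, and the sketch should be read as an illustration rather than a reanalysis.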


Unsuccessful or Aborted Leadership as a Con- 
sequence of the Discrepancy between Es- 
teem and Self-Esteem 


In Chapter 16 on conflicts in groups (Bass, 1960), we deduced that members whose self-esteem (F)⁷ is greater than the esteem accorded them by their associates (E) will attempt more leadership but will succeed less.

the higher a member’s self-esteem, the more likely he is to attempt leadership. The higher he is esteemed by others in the group, the more successful he will be in his attempts to lead. It follows that if a member's self-esteem is much higher relative to his esteem, he will attempt leadership, but his attempts may be rejected by the other members (p. 322). (Quoted by permission of Harper & Brothers)


7 Esteem was based on the extent a member’s removal [illegible] loss to the group. A member's esteem was the average rating assigned him by the other members at the end of the last discussion. His self-esteem was his rating of himself on the same five-point scale (F).







A significant positive correlation of .32 was found between self-esteem and attempted leadership. The corresponding correlation was .19 between self-esteem and relative success as a leader among the 255 cadets. On the other hand, esteem correlated .36 with successful leadership but .42 with attempted leadership.⁸ To contrast the success of “modest” and “immodest” cadets who attempted similar amounts of leadership, it was thus necessary to control the covariance between individual esteem, self-esteem, attempted, and successful leadership. Therefore, 252 of our 255 ROTC cadets were divided into 126 who were above the median in attempted leadership and 126 who were lower in such attempts on the average during all 10 discussions combined. Then, for each subject, the average extent others in his group esteemed him at the end of the last of the 10 discussions (E) was subtracted from his self-esteem (F). A high (F-E) discrepancy score suggested “immodesty,” an appraisal by the subject that his self-esteem was greater than the esteem others accorded him. A low (F-E) score or a negative one implied “modesty” or appraising oneself as relatively less valuable in comparison to the group’s opinion. Each sample of 126 was subdivided into a subsample of 63 “modest” and 63 “immodest” men. Thus, four subsamples of 63 men each were isolated, low or high in attempted leadership and modest or immodest.

Among the 126 cadets above the median in attempted leadership, the 63 “modest” cadets earned an average leadership success (A) of .37; the 63 “immodest” cadets earned an average success of only .10. Again, among the 126 cadets below the median in attempted leadership, the “modest” cadets were more successful than the “immodest” although attempting similar amounts of leadership. The mean relative success of the “modest” men was [illegible] while it was [illegible] for the “immodest.” The appropriate factorial analysis of variance found the mean successful leadership of .2[illegible] of those 126 men above the median in attempted leadership significantly greater at the 1% level than the mean relative success of [illegible] of those below the median in attempts, reflecting the positive correlation between attempted and successful leadership. (One must attempt leadership in order to succeed at it.) Similarly, the “modest” cadets earning mean scores of .18 were significantly more successful (at about the 6% level) than the “immodest” cadets attempting similar amounts of leadership who earned average relative leadership success scores of −.14. The interaction of “modesty” and attempts to lead was not significant.

8 These were averages of product-moment correlations obtained within nine samples at differing levels of motivation and amounts of organization.
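The formation of the four subsamples can be sketched as two successive median splits: first on attempted leadership, then on the discrepancy score (F-E) within each half. The sketch below assumes the second split was made within each attempt level, and it sends ties to the lower group; neither detail is stated in the text. The eight cadets, their scores, and the function name `split_cadets` are hypothetical.

```python
from statistics import median


def split_cadets(cadets):
    """Divide cadets into four cells: attempts (high/low) x modesty.

    Each cadet is a dict with 'attempts', 'F' (self-esteem) and 'E'
    (mean esteem accorded by others). Discrepancy is F - E; a high
    discrepancy is labeled 'immodest', a low one 'modest'.
    Ties at a median are placed in the lower group (an assumption).
    """
    attempt_cut = median(c["attempts"] for c in cadets)
    halves = {
        "high": [c for c in cadets if c["attempts"] > attempt_cut],
        "low": [c for c in cadets if c["attempts"] <= attempt_cut],
    }
    cells = {}
    for level, members in halves.items():
        disc_cut = median(c["F"] - c["E"] for c in members)
        cells[(level, "immodest")] = [c for c in members if c["F"] - c["E"] > disc_cut]
        cells[(level, "modest")] = [c for c in members if c["F"] - c["E"] <= disc_cut]
    return cells


# Hypothetical data for eight cadets (not taken from the study).
cadets = [
    {"id": 1, "attempts": 1, "F": 4, "E": 2.0},
    {"id": 2, "attempts": 2, "F": 3, "E": 3.5},
    {"id": 3, "attempts": 3, "F": 5, "E": 2.5},
    {"id": 4, "attempts": 4, "F": 2, "E": 3.0},
    {"id": 5, "attempts": 5, "F": 3, "E": 4.0},
    {"id": 6, "attempts": 6, "F": 5, "E": 4.5},
    {"id": 7, "attempts": 7, "F": 4, "E": 4.0},
    {"id": 8, "attempts": 8, "F": 5, "E": 3.0},
]

cells = split_cadets(cadets)
for key in sorted(cells):
    print(key, [c["id"] for c in cells[key]])
```

Mean leadership success could then be computed per cell and compared across the modesty factor with attempts held roughly constant, which is the contrast the factorial analysis reports.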


SUMMARY 


Theorems from Leadership, Psychology 
and Organizational Behavior (Bass, 1960) 
provided hypotheses for experimental tests. 
This paper reports experimental verification 
of the following: 


1. A highly significant correlation of .45 was found between initial accuracy and relative successful leadership in 255 more effective discussions, while the correlation was only .07 in 255 less effective discussions.

2. A correlation of .42 was obtained between esteem and relative successful leadership in 25 groups with higher average effectiveness on the 10 problems, while the correlation was .22 in the 25 less effective groups.

3. Those men whose self-esteem outweighed their esteem exhibited a mean success as leaders of −.14 while those whose esteem was higher than their self-esteem earned a successful leadership score of .18. Significant effects emerged when the differences in attempts to lead were controlled.


CUMULATIVE COMMUNALITY CLUSTER ANALYSIS
OF WORKERS’ JOB ATTITUDES


ROGER HARRISON 1


Procter and Gamble

A recent paper by Wherry (1958) examines the similarity of factorial structures obtained from four studies of the SRA Employee Inventory. Wherry (1954, 1958) concludes that the factors found in the four studies are quite similar, though the claim has been disputed (Baehr, 1956). Wherry named a general and five group factors: Working Conditions, Financial Reward, Supervision, Management, and Personal Development.

This paper extends the investigation of the reliability of dimensional analysis to another paper-and-pencil job attitude questionnaire, using a method of analysis different from those applied in the studies cited by Wherry. Tryon’s cumulative communality cluster analysis was applied to the intercorrelations of the items from job attitude questionnaires administered to two industrial work groups. The clusters found are compared with the factors named by Wherry.


PROCEDURE


Samples. All hourly paid men (N illegible) from Plant A, a medium sized manufacturing installation, were administered a job attitude questionnaire. All hourly paid men (N = 650) from Plant B, a large manufacturing plant, were administered a similar questionnaire. Plants A and B made similar products using similar processes.

The questionnaire. Each of the two questionnaires contained about 106 questions. [Illegible] questions in each questionnaire were of sufficient general interest to be selected for factor analysis. Sixty-eight questions were common to the two questionnaires. The present study focuses on ways in which responses to these latter were clustered in the two analyses.

All items were written in such a way as to present to the subjects an aspect of their job situation which they were to rate on a five-point scale. The points on the scale constituted [illegible] with the job aspect referred to in the question, ranging from “very unfavorable,” through “neutral,” to “very favorable.”

The factor analysis. The responses to the items were dichotomized so as to give as near to a 50-50 split as possible between high and low groups. Tetrachoric correlations were computed among the items, using a program developed for the IBM [model illegible] by H. W. Garrison and M. Charap of Educational Testing Service. Tryon’s cumulative communality cluster analysis was applied to the intercorrelations using an IBM [model illegible] program written by D. E. Bailey and J. O. Neuhaus of the University of California (Tryon, 1958).

1 Now at Yale University. The author wishes to express his appreciation to Robert C. Tryon of the University of California (Berkeley) for his suggestions and to Daniel E. Bailey of the same institution for carrying out the factor analyses. Beverly A. Veatch carried out the blind clustering of the variables.


Cumulative communality cluster analysis is a multiple group method of factor analysis. An objective criterion for the selection of [illegible] is programed, resulting in the selection of the variable in the matrix with the highest variance of correlations as a pivot variable for each factor. Other items whose profiles of correlations in the matrix are similar to those of the pivot variable are grouped with it, and the first factor is passed through the centroid of this group. The process is repeated on each succeeding residual matrix. This method of factoring results in factors which are close to orthogonal simple structure. That is, the factors are so located that each item is likely to have high loadings on only a few factors.

The program continues to [illegible] until at least 97.5% of the common variance has been accounted for by the loadings on the factors. The program reiterates on the communality estimates [remainder illegible].

Following the multiple group factor analysis, the items were grouped into clusters on the basis of their patterns of factor loading. The [illegible] suggested by Tryon [remainder illegible].

A table of the coordinates of the variable domains (variable vectors) on each of the factor dimensions was constructed. The coordinates were obtained by dividing each of the factor loadings by the square root of the variable’s communality. Items were included in a cluster if their profiles of coordinates were judged similar, i.e., if their vectors showed similar directions in the space defined by the orthogonal factors. Items with low communalities (below .4) were excluded from the clusters.

The clusters were constructed by blind analysis, using only the coordinates of the variable domains. The analyst did not know the item content. Variables from the two questionnaires were clustered separately and independently. At the conclusion of the blind clustering, there were a few variables (5 to 10) about which it had been difficult to make a clustering decision. Such variables were finally clustered using the item content as a guide to choosing one of two possible clusters.
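The coordinate step described in the procedure, dividing each factor loading by the square root of the item's communality, can be sketched as follows. This is a minimal illustration, not the Bailey and Neuhaus program: it assumes exactly orthogonal factors, so that an item's communality is the sum of its squared loadings, and the loadings and function names are hypothetical.

```python
import math


def domain_coordinates(loadings):
    """Scale each item's loadings to unit length.

    Assumes orthogonal factors, so communality = sum of squared loadings.
    After scaling, each row gives the direction of the item's vector in
    the common-factor space, independent of how much variance it shares.
    """
    coords = []
    for row in loadings:
        communality = sum(x * x for x in row)
        root = math.sqrt(communality)
        coords.append([x / root for x in row])
    return coords


def direction_similarity(u, v):
    """Cosine of the angle between two unit-length coordinate profiles."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical loadings of three items on two factors.
loadings = [
    [0.8, 0.3],     # item A
    [0.6, 0.225],   # item B: weaker, but pointing the same way as A
    [0.2, 0.7],     # item C: a different direction
]

coords = domain_coordinates(loadings)
print(round(direction_similarity(coords[0], coords[1]), 3))
print(round(direction_similarity(coords[0], coords[2]), 3))
```

Items A and B would be candidates for the same cluster because their profiles point in the same direction even though their communalities differ; item C would not. In the study the similarity judgment was made by an analyst inspecting the coordinate table, so the cosine here is only one way to make "similar directions" concrete.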


RESULTS AND DISCUSSION 

The clusters of items arrived at in the 
two analyses are quite similar. Table 1 is a 
joint frequency distribution of variables in 
the clusters from the two plants. It shows 
how items which were clustered together in 
one analysis tended to be clustered together 
in the other. 
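A joint frequency distribution of the kind reported in Table 1 can be built by counting, for every item common to both analyses, the pair of cluster numbers it received. The sketch below uses hypothetical item labels and cluster assignments; only the tabulation idea comes from the text.

```python
from collections import Counter


def joint_frequency(clusters_a, clusters_b):
    """Count items by (Plant A cluster, Plant B cluster).

    Both arguments map item labels to cluster numbers. Only items
    present in both analyses are tabulated.
    """
    shared = sorted(set(clusters_a) & set(clusters_b))
    return Counter((clusters_a[item], clusters_b[item]) for item in shared)


# Hypothetical assignments for six items (not the study's data).
plant_a = {"q1": 1, "q2": 1, "q3": 2, "q4": 3, "q5": 3, "q6": 4}
plant_b = {"q1": 1, "q2": 1, "q3": 1, "q4": 2, "q5": 2, "q7": 3}

table = joint_frequency(plant_a, plant_b)
for (a, b), count in sorted(table.items()):
    print(f"Plant A cluster {a}, Plant B cluster {b}: {count} item(s)")
```

When most of the count in each row falls in a single column, items clustered together in one plant are also clustered together in the other, which is the pattern the text describes. A Plant A row pair that shares one Plant B column (clusters 1 and 2 above) corresponds to finer distinctions in Plant A.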

Table 2 lists the job aspects referred to 
by the items in each cluster in the two 
analyses. There are 12 distinguishable clusters 
in the Plant A analysis and nine in that of 
Plant B. It appears that Plant A employees 
made finer distinctions than did the workers 
in Plant B. Some items which are separated 
into two clusters in Plant A tend to be 
grouped in the same cluster in Plant B. For 
example, this is true of items dealing with 
attitudes toward one’s foreman. The same 
items form three clusters in Plant A; they 
are in two clusters in the Plant B analysis. 




The kinds of variation represented by the 
clusters in Table 2 are quite similar to those 
discussed by Wherry (1958) in his summary 
of four factor analyses on the SRA Employee 
Inventory, and the clusters found in the 
present two analyses may easily be grouped 
under the headings from Wherry’s summary. 
The lack of a general factor in the present 
study is, of course, an artifact of the choice 
of method of analysis. The correlation be- 
tween the cluster domains (vectors) ranges 
from .16 to .94 with a median of .51 in the 
analysis of Plant A, and from .31 to .84 with 
a median of .56 in the analysis of B. Clearly, 
a general factor could have been extracted 
if desired, since the cluster domains are all 
positively correlated with one another. In the 
present study, since the objective was to 
study patterns of response rather than to 
search for underlying independent factors, a 
general factor was not extracted. The signifi- 
cant finding is that whether one looks for 
underlying dimensions of job attitudes, as in 
the Wherry rotation, or whether one seeks to 
group items according to similar patterns of 
response as in the present study, the same 
kinds of variation emerge. 


TABLE 1

JOINT FREQUENCY DISTRIBUTION OF VARIABLES IN THE TWO CLUSTER ANALYSES, BY CLUSTERS

[Table body and footnotes not recoverable from the source. Rows: Plant A clusters (1 to 12); columns: Plant B clusters; marginal totals.]


TABLE 2

DESCRIPTIONS OF THE CLUSTERS IN EACH ANALYSIS

Plant A
Cluster and Description
1. Consideration and helpfulness of the foreman
Foreman’s competence in administering the department
Foreman’s competence in supervising the job
Physical working conditions and nervous tension on the job
Advancement
Employee pla...
Earnings
Administrative ... higher management
Higher management ... for employ...
Friendliness and ... management
Importance of ...
Facilities: medical ... locker rooms

Plant B
Cluster and Description
1. Consideration and helpfulness of the foreman and his competence in departmental administration
Foreman’s competence in supervising the job
Physical working ...
Advancement ...
Employee plans ...
Administrative ... consideration ... higher ...
Friendli... management
Willingness of ... for safety ...


There is thus some reason to believe that the kinds of variation in job attitudes identified by Wherry are not specific to the questionnaire used nor to the method of dimensional analysis. Some caution in interpretation of the present results is, however, necessary. Several recent studies have shown that job aspects variously interpreted as “self-realization” or as opportunities for advancement, recognition, and achievement, are central to managers’ evaluations of their jobs (Herzberg, Mausner, & Snyderman, 1959; Schwarz, 1959). Appropriate questions for evaluating these aspects of workers’ jobs were not included in the present questionnaires, so these results are not directly comparable to studies which have found such factors important.


SUMMARY


Job attitude questionnaires were administered to two groups of industrial workers. The questionnaires were analyzed separately for the two groups by Tryon’s cumulative communality cluster analysis. The resulting clusters, arrived at by blind analysis, were compared with each other and with the factors summarized by Wherry. The two sets of clusters were quite similar.

Kinds of variation indicated by the clusters were similar to those found by Wherry in his summary of four analyses of the SRA Employee Inventory.

It is concluded that these kinds of variation may be common to a wide variety of industrial jobs and that they do not depend strictly on the questionnaire used or on the method of analysis.


FURTHER EVIDENCE OF A PRACTICE EFFECT
ON THE MILLER ANALOGIES TEST


ROBERT M. COLVER AND CHARLES D. SPIELBERGER


Duke University


In a study designed to test the hypothesis that scores on the Miller Analogies Test (MAT) improved with practice, Spielberger (1959) found that scores were significantly higher on retest with an alternate form of the MAT for each of three independent samples of Ss. But since all of the Ss in these samples were either graduate students or senior undergraduate honor students in psychology, it was suggested that the observed practice effect might be limited to “bright, psychologically sophisticated students” who might profit more from practice than less intelligent, less sophisticated Ss. The present study was designed to evaluate the effects of practice on the MAT for senior undergraduate Ss with less psychological sophistication and lower mean intelligence than characterized the Ss in the previous study.

METHOD 

Forms G and H, alternate forms of the MAT, were administered in a counterbalanced order to 36 liberal arts seniors enrolled in a course in educational psychology for nonpsychology majors. The major fields of study of the Ss were English (8), foreign languages ([illegible]), political science ([illegible]), history (12), religion (1), mathematics (1) and science-education (8). None of these Ss had had previous experience with the MAT. The tests¹ were administered under standard group conditions by a MAT testing center director except that the Ss were specifically informed that the results would be used for research purposes. Also, in order to make motivation more comparable to Ss taking the MAT for some official purpose, they were told prior to their initial experience with the test that the results could be used for individual counseling if such counseling was requested. Seventeen Ss (Group I) were first given Form G of the MAT followed by Form H. Nineteen Ss (Group II) were given these same forms in reversed order.

1 The cooperation and interest of Harold Seashore and the Psychological Corporation in making the MAT available for this research is gratefully acknowledged.


RESULTS


Prior to analyzing the data, Form H raw scores were made equivalent to Form G scores by adding two points to all Form H scores in the 30 to 70 score range (Miller, 1952, p. 6).

The means and standard deviations for the two successive administrations of the MAT are presented in Table 1. The mean score obtained by the Ss on their second experience with the test was significantly higher than the mean score for their initial performance (t = 5.66, p < .001). Of the 36 Ss, 27 improved. The consistency of the improvement was further indicated by the Pearson r between initial and final scores of .86.

Since a lack of equivalence between alternate forms was suggested in Spielberger’s study, the data were further evaluated by an analysis of variance (Lindquist, 1953, Type II design) which tested the equivalence of Forms G and H as well as the effects of practice. This analysis was possible since the alternate forms had been given in counterbalanced order. This analysis also provided a more precise test of a practice effect in that the t test for related measures used to evaluate the difference between the means in Table 1 was based on an error term inflated by variance attributable to the use of alternate forms of the MAT.
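The two computational steps described above, equating Form H to Form G and testing the first-to-second gain with a t test for related measures, can be sketched briefly. The sketch assumes the 30 to 70 range is inclusive at both ends, which the text does not state, and the four score pairs are hypothetical; the function names are illustrative only.

```python
import math
from statistics import mean, stdev


def equate_form_h(raw_score):
    """Add two points to a Form H raw score in the 30 to 70 range.

    Inclusive bounds are an assumption; scores outside the range are
    returned unchanged.
    """
    return raw_score + 2 if 30 <= raw_score <= 70 else raw_score


def related_t(first, second):
    """t statistic for paired scores, with n - 1 degrees of freedom."""
    diffs = [b - a for a, b in zip(first, second)]
    standard_error = stdev(diffs) / math.sqrt(len(diffs))
    return mean(diffs) / standard_error


# Hypothetical subjects who took Form G first and Form H second.
form_g_first = [50, 55, 60, 45]
form_h_second_raw = [52, 56, 64, 45]
form_h_second = [equate_form_h(s) for s in form_h_second_raw]

t = related_t(form_g_first, form_h_second)
print(f"t({len(form_g_first) - 1}) = {t:.2f}")
```

As the text notes, a paired t computed this way folds any remaining form difference into its error term, which is why the counterbalanced analysis of variance gives the more precise test of practice.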

The means and SDs for Groups I and II are presented in Table 2 where it may be noted that Form G appears to be somewhat easier than H. The F test of the effect of form, however, was not significant (F = 2.00, p > .05). The F test of the effect of practice was highly significant (F = 34.29, p < .001).


TABLE 1

MEANS AND SDs FOR TWO SUCCESSIVE ADMINISTRATIONS OF THE MILLER ANALOGIES TEST
(N = 36)

First Test              Second Test
Mean     SD             Mean     SD
56.30    [illegible]    62.00    [illegible]


TABLE 2

MEANS AND SDs FOR FORMS G AND H OF THE MILLER ANALOGIES TEST GIVEN IN A COUNTERBALANCED ORDER
(17 in each group)

            Form G                  Form H
            Mean          SD        Mean     SD
Group I     [illegible]   10.9      61.88    [illegible]
Group II    [illegible]   13.1      53.94    12.8

CONCLUSION


This study was designed to determine whether the practice effect on the MAT found by Spielberger (1959) was due to the fact that his Ss were bright, psychologically sophisticated students. In the present study, 27 of 36 liberal arts students (75%) improved their scores and the mean increase in score for the group was highly significant. This result was similar to the findings of Spielberger’s study where 39 out of 48 psychology students (81%) improved and where the magnitude of improvement was also highly significant. Thus, it would appear that the observed practice effects on the MAT are not limited to psychology students. The findings of this study support the hypothesis that scores on the MAT improve as a function of previous experience with an alternate form of the test and that this improvement is not limited to bright, psychologically sophisticated students.


Woodworth (1938), in reviewing the literature concerned with esthetic preferences for geometric figures, noted that the results of these studies may be confounded by the central tendency effect, i.e., the tendency for the subject to prefer figures toward the center of the range presented to him. For example, a study by Witmer is summarized in which it was found that when subjects were presented with various graduated series of isosceles triangles, the particular triangle most preferred was the one in the middle of the series. However, when the range in the series became too extreme the central tendency effect was not obtained and subjects preferred figures toward the less extreme end of the series.

Austin and Sleight (1951), pursuing the problem, believed that the effects of this central tendency bias could be eliminated by presenting the figures to the subject by the paired-comparison method. The procedure they used was as follows. Twelve triangles (altitude-to-base proportions ranging from ¼” × 1” to 3” × 1”, by ¼” altitude steps) were combined in all possible combinations of two (66 pairs in all) and mimeographed on separate sheets. The order of each pair of triangles on the pages, as well as the order of the pages themselves in the booklets, was randomized.
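The booklet construction just described, every pair of 12 stimuli with both the within-page order and the page order randomized, can be sketched in a few lines. The altitudes below (quarter-inch steps up to 3 inches) are one reading of the garbled figures in the source and should be treated as an assumption; `build_booklet` and its `seed` parameter are hypothetical names.

```python
import itertools
import random
from fractions import Fraction


def build_booklet(stimuli, seed=None):
    """Return a randomized paired-comparison booklet.

    Every unordered pair of stimuli appears on exactly one page. The
    left/right order within each page and the order of the pages are
    both shuffled.
    """
    rng = random.Random(seed)
    pages = [list(pair) for pair in itertools.combinations(stimuli, 2)]
    for page in pages:
        rng.shuffle(page)   # randomize position within the page
    rng.shuffle(pages)      # randomize page order in the booklet
    return pages


# Assumed altitudes in inches: 1/4, 1/2, ..., 3 (12 triangles, 1-inch base).
altitudes = [Fraction(k, 4) for k in range(1, 13)]

booklet = build_booklet(altitudes, seed=1)
print(len(booklet), "pages")
```

Twelve stimuli give 12 × 11 / 2 = 66 pages, matching the count in the text. Randomizing the order removes position cues, but each subject still sees the whole range over the course of the booklet, which is the opening for a series effect.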

The results of that study are shown by the 
solid line in Figure 1. Triangles with altitudes 
between 1 and 2 inches were most preferred. 
Is this preference for figures in the center of 
the range dependent or independent of a cen- 
tral tendency effect? Austin and Sleight con- 
cluded a central tendency effect was not oper- 
ating since the figures had been exposed two 
at a time in a random sequence. 

The purpose of the present study was to explore the problem a step further by essentially replicating the Austin and Sleight study with one basic change. Since Austin and Sleight had used only one series of stimuli in their study, there was no basis for inferring whether preferences were or were not dependent upon a series effect. In the present study three series of stimuli were used corresponding to the lower, middle, and upper parts of the range used by Austin and Sleight. More specifically, Series I consisted of 12 isosceles triangles ranging from [illegible]” × 1” to 1[illegible]” × 1”, by [illegible]” altitude steps. (Since all triangles in the original study as well as in the current study had a 1 inch base, we will refer henceforth to triangles in terms of their altitudes.) Series II and III consisted of 12 triangles each and ranged from [illegible]” to 2[illegible]” and from 1[illegible]” to 3”, respectively. Three sets of booklets were prepared each containing one of three series of figures. The three types of booklets were distributed randomly to a class of 78 nursing students taking an introductory course in psychology. One-third of the group responded in booklets containing Series I figures, one-third to Series II figures, and the remaining third to Series III figures.

1 Grateful acknowledgement is made to R. Mathias for providing subjects and to R. Spence for performing the calculations.

The results are presented in the three broken line curves in Figure 1. It is apparent from inspection of the curves that the preferences are, despite the use of the paired-comparison method, not free of a series effect. A preference for figures in the middle of the range is seen in the Series I and II. The absence of the central tendency effect in the Series III data is consistent with Witmer’s finding that the central tendency effect was not found when the range in the series became extreme. The central tendency bias can hardly be said to have been “minimized” let alone eliminated, when one finds such marked discrepancies in preferences for a particular triangle when it appeared in the different series. For example, the 1[illegible]” triangle is among the least preferred when it appeared in Series I but among the most preferred when it appeared in either the original Austin and Sleight series or in Series II.

On the basis of these results it must be concluded that Austin and Sleight’s use of the paired-comparison method did not eliminate the central tendency effect from their data. In the course of making their initial judgments subjects apparently did acquire some knowledge about the range of stimulus figures involved and did persist in showing preferences for figures in the middle of the range.

While it is clear that the central tendency effect did operate in the Austin and Sleight study it is not clear under what particular circumstances the central tendency bias could be expected to be a problem in other preference studies. One cannot help but speculate that it might occur in any situation where the subjects have no preferences or only very weak preferences for the stimuli. This could well be the case for isosceles triangles. Faced with the task of expressing preferences when they have none, subjects may resolve their dilemma by simply choosing the least extreme of the stimuli presented to them.


There is no reason to believe that this biasing effect would be limited to studies involving esthetic preferences. For example, Hall and Bennett (1956) investigated the optimal diameter of handrails of public stairways by having subjects express preferences for diameters of 1.5, 1.75, 2.00, and 2.50 inches. The curve they obtained from this study has the same general shape as Austin and Sleight’s and the possibility must be considered that the subjects were only exercising a central tendency bias in the absence of preferences for the stimuli as such.

Two solutions to the problem come to mind. One, as in the present study, several parts of the stimulus range can be explored separately to determine if preferences for the stimuli as such override a possible series effect. Two, by having different subjects express preferences for each individual stimulus value no series effect could operate.


DESIGN AND INTERPRETABILITY OF ROAD SIGNS

Ohio State University


A standardized series of road signs is now in use in some of the Western European countries. The signs have received considerable attention in this country, partly due to their uniqueness and partly due to their apparent ease of interpretation. The signs make minimal use of language, attempting instead to convey the desired information through pictorial and symbolic representation.

The road signs used in the United States are often ambiguous, require considerable time to interpret, depend primarily on written language, and lack standardization. These factors lead to inconvenience, loss of travel time, and may contribute, directly and indirectly, to accidents. It thus becomes a matter of importance to investigate means for improving the interpretability of United States road signs.

It was the purpose of the present study to determine how well the European signs could be interpreted, and to relate these findings to sign preferences (stereotype). More specifically, the aim of the study was fourfold: (a) to investigate the interpretability of the European road signs; (b) to determine if stereotypes exist for signs appropriate to highway use, and if so; (c) to determine if there are general characteristics which are common to both the stereotypes and the easily interpreted European signs; and (d) to test the effectiveness, in terms of increased interpretability, of signs based on stereotypes.


METHOD


The study was conducted in five phases. The first two phases employed different methods of determining the interpretability of signs prior to [illegible]. The signs, and their meaning, are shown in Table 1. The third phase was an attempt to assess the effect of limited experience with the signs on their interpretability. The fourth phase was designed to discover the stereotypes for sign meanings. The fifth phase was concerned with determining the interpretability of signs derived from Phase IV, i.e., the interpretability of signs based on the stereotypes.

[Footnote partly illegible: This research ... of Aviation Psychology ...]

Phase I. Thirty of the European signs (reproduced on 12” × 12” display cards) were presented singly. Each sign was displayed to a group of Ss for a period of [illegible] sec. The Ss were asked to write the meaning which they thought the sign conveyed, within the display interval. At the end of each interval a new sign was shown. The Ss were [number illegible] students in an introductory psychology course at Ohio State University.

An interpretation was scored as correct if three Es independently judged S’s response to convey the same meaning as the European sign meanings. The interpretability score used was the percentage of correct interpretations.

Phase II. The same signs were presented as in Phase I. However, in this phase Ss were supplied with an answer sheet containing a list of the sign meanings. They were asked to choose from the list the one meaning which best matched the sign being shown. The Ss were students from another introductory psychology course. The score used was the percentage of correct matchings.

Phase III. This phase immediately followed Phase II and used the same group of 36 Ss. Immediately following the matching test (Phase II), the signs were again presented to Ss, but this time their meanings were given orally by E. Following the single reading, the method used in Phase I was repeated and similar scores were obtained.

Phase IV. In this phase, 31 new Ss were supplied with blank sheets of paper with space for [illegible] drawings per page. Sixteen sign meanings were read aloud, one at a time, at 2-min intervals. The sign meanings were derived from the original European signs and included the meaning of Signs [numbers illegible] (see Table 1). In three instances [illegible] same or similar meaning [illegible]. It appeared desirable that the sign meanings to be used should be chosen so that the entire range of interpretability, found in the preceding phases, would be sampled. A second consideration was the applicability of the sign meaning to the American highway scene.

The Ss’ task was to draw, within the 2-min interval, a sign which they felt would convey the desired meaning and be easily interpreted. They were told that no words could be used, but that color and/or outline shape of the sign could be employed as coding dimensions.

Drawings were classified by the Es. The number of drawings which corresponded to the European signs was determined, as was the percentage of major alternative sign designs. In one instance (Sign [number illegible]), Ss were unable to draw a sign comparable to the European sign because the use of words was not permitted.

Phase V. In this final phase [number illegible] sign drawings on 12” × 12” display cards were constructed from the drawings of Phase IV. Twenty-nine new Ss were used. Experimental conditions and scoring procedures were the same as in Phase I. The signs involved in this phase are shown in Table [number illegible].


RESULTS 


Interpretability. The percentage of correct responses in Phases I, II, and III is presented in Table 1. Some signs which were difficult to interpret in Phase I were easily interpreted in Phase II (e.g., No. 19, 27, 28, 29), but in general the signs which were easily interpreted in Phase I were likewise easily interpreted in Phase II. A Pearson product-moment correlation coefficient between the first two phases was computed as .606, which is significant at the .01 level.
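
A product-moment coefficient of the kind reported above can be computed directly from paired per-sign scores. The following is a minimal sketch in Python, assuming two equal-length lists of percentage-correct scores (one entry per sign) for the two phases. The function name `pearson_r` and the sample values are hypothetical illustrations; they are not the data of Table 1.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two paired score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 pairs")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical percentage-correct scores for four signs (not study data).
phase_1_scores = [10, 40, 55, 90]
phase_2_scores = [35, 60, 70, 95]
r = pearson_r(phase_1_scores, phase_2_scores)
```

Whether a given coefficient reaches the .01 level depends on the number of signs entering the correlation, which would have to be taken from Table 1.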

There were fewer correct responses in Phase I than in Phase II, an average of 54% versus 74% correct responses, respectively. In Phase III (after Ss were given the correct sign meanings) the average interpretability of the signs approached 100%.

Stereotypes. Also presented in Table 1, under “American Stereotypes,” is a summary of the types of signs drawn in Phase IV. These drawings were separated into three major categories. The first category included those drawings which were directly comparable to the corresponding European sign. The second category was composed of the drawings which were comparable to either the corresponding European sign, but with some element added or subtracted, or to one of the other European signs. These instances are indicated by referring to the number of the comparable European sign plus the notation, when applicable, of the element which was added or subtracted. The third category included drawings which were distinctly different from any of the European signs. In these instances a brief description of the drawing is given.

The results of Phase IV indicate that stereotypes for many of the signs appeared in the drawing responses. Thus, three of the sign meanings (No. 27, 7, 18) led to drawings which were sufficiently similar that the major characteristic of the drawing was found in 100% of sign designs. Signs No. 20, 15, 10, 3, and 13 likewise produced good agreement among Ss, one major characteristic appearing from 45 to 81% of the time. Moderate amounts of agreement were found in the remainder of the signs, and in many instances those characteristics which Es have isolated as separate actually have some similarities. Although Ss were given the opportunity to use color and sign shape as coding dimensions, insignificant use was made of them.

The 10 signs which were constructed from the results of Phase IV are presented in Table 2. Included in this table are the percentages of correct interpretations of these signs, found in Phase V, together with the interpretability scores for the European signs which correspond in intended meaning: the latter scores were obtained in Phase I under experimental conditions identical to those of Phase V. It can readily be seen that the interpretability of the signs based on the stereotypes is superior to that of the corresponding European signs in all instances.


The interpretability score for 7 of the 10 stereotype-based signs was significantly higher (p < .05) than that of the corresponding European sign(s). However, four of the stereotype-based signs had a relatively low interpretability index (76, 59, 59, and 38%).

Relation between interpretability and stereotypes. Comparison of the findings for the stereotypes with their counterparts in Phases I and II provides the data relevant to the third aim of the study, the relation between stereotypes and the interpretability of the European signs.

In general, as percentage of correct interpretations in Phases I and II increased, the number of drawings of similar signs in Phase IV also increased. In Phase I, the signs with low scores (below 15% correct) were generally those which made use of some abstract coding dimension (e.g., circle and/or slash line to denote a prohibitive action). The signs with high scores (above 85% correct) were characterized (a) by having a direct counterpart in the American road sign system, and/or (b) by being a direct pictorial representation of the sign meaning. Several signs which were easily interpreted in Phase I, but were not included in Phase IV, also were direct pictorial representations (e.g., children crossing, bridge opening). In Phase IV, when the European sign was not drawn (or could not be drawn as was the case for the STOP sign), the general characteristics of the signs which were drawn were the same as those present in the easily interpreted European signs.

The most explicit evidence for the relation between stereotypes and interpretability comes from the results of Phase V. It can be clearly seen from Table 2 that all 10 of the stereotype-based signs were easier to interpret than their European counterparts; the average interpretability score for the former signs was approximately 75% as compared with 15% for the European signs.
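
The article does not state which significance test was applied to the paired interpretability percentages of Table 2. One common choice for comparing two independent percentages is the pooled two-proportion z test, and the sketch below assumes that test purely for illustration. The function name `two_proportion_z` and the counts in the example are hypothetical; they are not values from the study.

```python
import math


def two_proportion_z(correct_a, n_a, correct_b, n_b):
    """Pooled two-proportion z test; returns (z, two-sided p value)."""
    if n_a <= 0 or n_b <= 0:
        raise ValueError("group sizes must be positive")
    p_a = correct_a / n_a
    p_b = correct_b / n_b
    # Pooled proportion under the null hypothesis of equal interpretability.
    pooled = (correct_a + correct_b) / (n_a + n_b)
    variance = pooled * (1 - pooled) * (1 / n_a + 1 / n_b)
    if variance == 0:
        raise ValueError("test is undefined when the pooled proportion is 0 or 1")
    z = (p_a - p_b) / math.sqrt(variance)
    # Two-sided p value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 22 of 29 Ss correct on a stereotype-based sign,
# 5 of 30 Ss correct on the corresponding European sign (not study data).
z, p_value = two_proportion_z(22, 29, 5, 30)
significant_at_05 = p_value < 0.05
```

With groups as small as those used here, an exact test might be preferred over the normal approximation; the sketch is meant only to show the form of the comparison.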


DISCUSSION 


The results indicate that the most readily interpreted signs fall into two major categories: (a) signs employing directly pictorial representations (e.g., road narrows, children crossing); and (b) signs having direct counterparts in existing American signs (e.g., STOP, RR crossing). Signs in both of these categories have the major feature of being unambiguous; the former inherently so and the latter through the experience Ss have had with them.

Difficulties in interpretation appeared when unfamiliar and/or ambiguous coding dimensions were used. For example, the two codes used in European signs to indicate prohibited action, circle and slash, were not immediately clear. The use of these two codes resulted, in fact, in many reversals (e.g., left turn for no left turn, motorcycles permitted for no motorcycles permitted, and vehicle permitted for no vehicle permitted). The Ss apparently did not recognize the presence of the circle as a code, and the slash was often misinterpreted (e.g., as being an overpass) or ignored.

TABLE 2
RESULTS FROM PHASE V
Percentage of Correct Interpretation | Corresponding European Sign

The pictorial signs that were used in this study were generally unambiguous. Most of the easily interpreted pictorial signs were of a unidimensional character, employing a picture or symbol without the additional coding dimension of the circle and/or slash. It appears then that the use of additional symbolic dimensions tends initially to confuse the sign interpreter.

As opposed to the purely pictorial signs, symbolic signs do not seem to lend themselves as well to an immediate, correct interpretation. The symbolic signs which were interpreted easily in this study had counterparts in the American road system. The two symbolic signs which had no counterpart in the American road system (Signs 11 and 17) were never interpreted correctly in Phase I and were only moderately well interpreted in Phase II.

One of the major questions raised by the present study involves the relation between the stereotypes as demonstrated in the drawings, and the interpretability of signs as measured in the early phases. Do the stereotypes incorporate those features which make for ease in interpretation? The authors hypothesized that they would. Thus, we would expect to find some common basis underlying the easily interpreted sign and the stereotype. This hypothesis is partially substantiated by the results of Phase IV. The signs which were most difficult to interpret in Phase I and Phase II were rarely, if ever, drawn by Ss in Phase IV. On the contrary, those signs which were more readily interpreted in the early phases were drawn more frequently in Phase IV. In addition, the results showed that where the stereotype did not agree with the European sign design, the signs drawn still tended to reflect those features which were characteristic of the easily interpreted signs. When the drawings were compared with their hard-to-interpret European counterparts, for any given meaning, the drawings generally substituted a pictorial figure or familiar symbols for an abstract or unfamiliar symbol (e.g., [illegible] across a sign, or a picture of a hand, was substituted for a circle or slash).

The results from Phase V emphasize even more directly the beneficial effect on interpretability when signs are based on a demonstrated stereotype: the interpretability score for the 10 stereotype-based signs was generally considerably higher than for the corresponding European signs. Furthermore, there were no instances of interpreting a sign to mean the very opposite of its intended meaning; such reversals were frequent among some of the European signs. On the other hand, the last four signs shown in Table 2 were interpreted only moderately well, even though they were based on a stereotype level comparable to the remaining signs which yielded high interpretability scores. It appears, therefore, that a moderately high degree of consensus (e.g., 30-40%) among sign designers is not always sufficient basis for designing highly interpretable signs.

It must be remembered that the above results deal primarily with immediate interpretability of signs. The results of Phase III showed that after Ss were told only once what the signs meant, the interpretability of most signs approached 100%. This suggests that some signs having little or no correspondence with the stereotype may still be interpreted correctly by almost everyone who has a small amount of experience with the sign. A very different result might be found, of course, if a new meaning were given to an old sign for which a strong stereotype existed.


SUMMARY


The purpose of this study was to investigate the interpretability of European road signs, to determine whether stereotypes existed for road signs, to compare the general characteristics of the European signs with the stereotypes, and to determine whether the characteristics embodied in stereotype-based signs enhance interpretability. Interpretability was investigated by two different methods. In Phase I Ss wrote the meaning which they thought a sign conveyed, and in Phase II they chose from a list of possible meanings the one meaning which best matched the particular sign being shown. In Phase III, the same Ss who had participated in Phase II were told the meanings of the signs. Then the signs were presented again and Ss wrote the meaning which they thought the sign conveyed. Phase IV investigated the stereotypes for road signs. In this phase, sign meanings were given to Ss, and they designed signs which would convey these meanings. Stereotype-based signs were constructed from the results of Phase IV; the interpretability of these signs was determined in Phase V.

The results of the study can be summarized 
as follows: 


1. Interpretability of the European signs was partly a function of the method by which interpretability was examined. The mean interpretability score from Phase I was considerably lower than for Phase II, although the correlation between the two methods was significant.

2. The European signs were moderately well interpreted on first presentation; after one exposure to the correct meaning, interpretability approached 100%.

3. The easily interpreted European signs were generally pictorial representations of the sign meanings or were counterparts of American road signs. The signs which were difficult to interpret generally used abstract, unfamiliar symbols or included ambiguous cues.

4. Stereotypes for some road signs exist. The general characteristics found in the stereotypes were the same as those in the easily interpreted European signs.

5. Interpretability is enhanced if signs are stereotype-based. However, signs based on stereotypes of only moderate strength (30-40%) will not always be highly interpretable.

6. A small number of the European road signs could be efficaciously used in the United States, without necessitating prior instruction as to their meaning. The majority of the signs, however, could not be used without a minimal degree of familiarization.